ABSTRACT



A program transformation method for transforming a source
program described by a programming language into an
object program described by a language executable by a data
processing system includes: a process of transforming at
least a part of the procedures, functions or sub-routines used
in the source program into a form in which the object program
can be stored in an arbitrary storage region of a primary
storage device of the data processing system; a process of
arranging the procedures, functions or sub-routines, whether
or not transformed in the first process, in the storage regions
of the primary storage device corresponding to cache lines of
a cache memory without causing cache conflict, on the basis
of information relating to the procedures, functions or
sub-routines obtained during transformation of the source
program into the object program; and a process of generating
the object program on the basis of the result of the
arrangement.

23 Claims, 16 Drawing Sheets 
FIG. 2

extern int count = 0;

void func()
{
    func1();
    func2();
    func1();
}

void func1()
{
    count += 1;
}

void func2()
{
    count += 2;
}
FIG. 7

FIG. 8

GROUP1 : !LOAD ?RX A0x1000 {
_D_LIB $PROGBITS ?AX A0x20 _D_ test1 {D.o (libc.a)};
_G_test2 $PROGBITS ?AX A0x20 _G_ test2 {test2.o};
_E_test2 $PROGBITS ?AX A0x20 _E_ test2 {test2.o};
_C_LIB $PROGBITS ?AX A0x20 _C_ test1 {C.o (libc.a)};
_B_test1 $PROGBITS ?AX A0x20 _B_ test1 {test1.o};
_F_test2 $PROGBITS ?AX A0x20 _F_ test2 {test2.o};
_A_test2 $PROGBITS ?AX A0x20 _A_ test1 {test1.o};

FIG. 9

[Figure: blocks labeled A, B, C1, C2, D1, D2, E1, E2, F, G1, G2]












U.S. Patent Aug. 28, 2001 Sheet 8 of 16 US 6,282,707 B1
FIG. 18 (PRIOR ART)

void func(void)
{
    func_A();
    func_B();
    func_A();
}
FIG. 16 (PRIOR ART)

PROGRAM TRANSFORMATION METHOD
AND PROGRAM TRANSFORMATION 
SYSTEM 

BACKGROUND OF THE INVENTION 

1. Field of the Invention 

The present invention relates generally to a program 
transformation method, a program transformation system 
and a storage medium storing a program transformation 
program. More particularly, the invention relates to a pro- 
gram transformation method and a program transformation 
system for transforming (compiling) a source program 
described by a programming language into an object pro- 
gram described by a language (machine language, assembly 
language and so forth) executable by a computer, a central 
processing unit (CPU) and the like. 

2. Description of the Related Art 

FIG. 15 is a block diagram showing an example of a
construction of the conventional program transformation
system disclosed in Japanese Unexamined Patent Publication
No. Heisei 1-118931.

The program transformation system illustrated in FIG. 15
is constructed with a first program storage portion 151, a
compiler 152, a second program storage portion 153, a third
program storage portion 154, an input data storage portion
155, a program executing portion 156, a fourth program
storage portion 157 and a parsing result storage portion 158.

At first, the compiler 152 reads out a source program
described by a programming language, such as C language,
from the first program storage portion 151,
temporarily generates an object program described by a
machine language, an assembly language or the like, and
stores the temporarily generated object program in the
second program storage portion 153.

Here, the temporarily generated object program is the
program generated by transforming the source program into
codes of machine language, assembly language or the like
in the sequential order of description. While the temporarily
generated object program is executable by the computer, the
central processing unit (CPU) and so forth, the source
program has simply been transformed into codes in the order
in which it is written, so the temporarily generated object
program inherently contains redundant portions that enlarge
the size (code size) of the overall object program.
Therefore, a large storage capacity is required in the
primary storage device which stores the temporarily
generated object program. Furthermore, the execution
period of the object program becomes long, lowering
efficiency.

Therefore, it becomes necessary to generate an efficient
and optimal object program. The object program simply
transformed into codes from the source program in the
sequential order described in the source program by the
process set forth above will be hereinafter referred to as
the "temporary object program", distinguishing it from the
optimized final object program.

There are various methods for optimizing the object
program. Here, arrangement optimization of the instruction
codes of procedures is discussed. A procedure means a group
of processes, such as arithmetic operations, to be executed
by the computer or CPU, and is often called a function or
sub-routine. Throughout the disclosure and claims, such a
group of processes will be generally referred to as a
"procedure".
In a program, it can become necessary to call another
procedure (hereinafter referred to as the "callee" side
procedure) during execution of some procedure (hereinafter
referred to as the "caller" side procedure) at a certain
portion of the program. Therefore, when the source program
is transformed into the object program and the resultant
object program is stored in the primary storage device, if an
instruction code of the caller side procedure and an
instruction code of the callee side procedure closely related
to the former are physically arranged close to each other, a
procedure call instruction can be changed from that for a
long jump to that for a short jump.

By this, the code size of the overall object program can be
reduced. In conjunction therewith, the execution speed upon
executing the object program in the computer or the CPU
can be higher. Arranging instruction codes having a
high possibility of being executed sequentially in time at
physically close positions in the object program is called
arrangement optimization of the instruction codes of the
procedures.
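
The choice between the long and short call forms described above can be sketched in C as follows. This is only an illustration: the 16-bit displacement range and the instruction sizes of 3 and 5 bytes are assumptions chosen for the example, not values taken from the cited publication, and the function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical encodings (assumptions): a short call carries a signed
   16-bit displacement, a long call carries a full 32-bit address. */
enum { SHORT_CALL_BYTES = 3, LONG_CALL_BYTES = 5 };

/* Returns 1 when the callee is reachable with the short call form. */
int fits_short_call(uint32_t call_site, uint32_t callee)
{
    int64_t disp = (int64_t)callee - (int64_t)call_site;
    return disp >= INT16_MIN && disp <= INT16_MAX;
}

/* Size in bytes of the call instruction chosen for this pair. */
int call_size(uint32_t call_site, uint32_t callee)
{
    return fits_short_call(call_site, callee) ? SHORT_CALL_BYTES
                                              : LONG_CALL_BYTES;
}
```

Under these assumptions, every call whose callee is placed within reach of the short form saves two bytes, which is why placing closely related procedures near each other reduces the overall code size.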

Next, the program executing portion 156 reads out a
procedure call frequency parsing program from the fourth
program storage portion 157 and executes the same.
Namely, the program executing portion 156 reads out the
temporary object program from the second program storage
portion 153. In conjunction therewith, input data stored in
the input data storage portion 155, input by an operator, is
read out by the program executing portion 156. Then, the
program executing portion 156 simulates execution of the
temporary object program and, in conjunction therewith,
counts the number of times each procedure in the temporary
object program calls other procedures. The resulting counts
are stored in the parsing result storage portion 158 as a
procedure reference frequency parsing result.

Then, the compiler 152 reads out the procedure reference
frequency parsing result from the parsing result storage
portion 158 to calculate the closeness of the reference
relationship between any two procedures. On the basis of the
resultant closeness, arrangement optimization of the
instruction codes is performed to generate the final object
program, which is stored in the third program storage
portion 154.
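
The counting and closeness calculation described above can be sketched as follows. The trace format, the fixed procedure count NPROC and the closeness metric (calls in either direction between two procedures) are assumptions made for illustration; the cited publication's actual metric is not given in this text.

```c
#include <stddef.h>

#define NPROC 4  /* number of procedures in this toy example (assumption) */

/* One observed call: caller and callee are procedure indices. */
struct call_event { int caller; int callee; };

/* Accumulate how many times each caller invoked each callee. */
void count_calls(const struct call_event *trace, size_t n,
                 unsigned counts[NPROC][NPROC])
{
    for (int i = 0; i < NPROC; i++)
        for (int j = 0; j < NPROC; j++)
            counts[i][j] = 0;
    for (size_t k = 0; k < n; k++)
        counts[trace[k].caller][trace[k].callee]++;
}

/* Assumed closeness metric: calls in either direction between a and b. */
unsigned closeness(unsigned counts[NPROC][NPROC], int a, int b)
{
    return counts[a][b] + counts[b][a];
}
```

A layout pass could then place the pairs with the highest closeness next to each other in the final object program.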

On the other hand, FIG. 16 is a block diagram showing an 
example of a construction of the conventional program 
transformation system disclosed in Japanese Unexamined 
Patent Publication No. Heisei 9-34725. 

The program transformation system illustrated in FIG. 16
is generally constructed with a source program storage
portion 161, a compiler 162 and an object program storage
portion 163.

The compiler 162 is generally constructed with a parsing
portion 164, a procedure call occurrence counting portion
165, a code generating portion 166, a procedure call count
data storage portion 167, a special space arranged procedure
determining portion 168 and an object program outputting
portion 169. Here, a special space means a special region of a
finite code size set in a part of a program space.

The parsing portion 164 reads out the source program to
be parsed from the source program storage portion 161 and
parses the syntax forming the source program. The procedure
call occurrence counting portion 165 counts the number of
times each procedure recognized by the parsing portion 164
is called upon parsing the syntax.

The code generating portion 166 performs code generation
twice. Namely, at the first code generation, the code
generating portion 166 generates a normal code if the syntax is
not the procedure call instruction, and generates an
instruction code using a normal call instruction if the syntax
is the procedure call instruction, on the basis of the result of
parsing by the parsing portion 164. On the other hand, in the
second code generation, the code generating portion 166 scans
the results of the first code generation from the leading end.
Then, if the code is a procedure call instruction and a result
of inquiry to the special space arranged procedure determining
portion 168 shows that the procedure is a special space
arranged procedure determined to be arranged within the
special space, the normal call instruction code having a large
byte count is replaced with a dedicated call instruction code
having a smaller byte count.

The procedure call count data storage portion 167 stores
the call count counted by the procedure call occurrence
counting portion 165 per procedure and the code size of the
code generated in the first code generation. The special
space arranged procedure determining portion 168 selects
and determines the procedures to be arranged within the
special space, giving preference to procedures having a
greater call count, so that the sum of the code sizes of the
procedures to be arranged within the special space falls
within the code size of the special space, on the basis of the
call count and code size per procedure stored in the procedure
call count data storage portion 167.

The object program output portion 169 outputs the code
to a segment to which an arrangement attribute for the special
space is added when a result of inquiry to the special space
arranged procedure determining portion 168 shows that the
code generated by the code generating portion 166 is the code
of a definition portion of a special space arranged procedure,
and outputs the code to a normal segment when it is not. Here,
a segment means a group of codes serving as the minimum unit
of arrangement when the code is arranged within the program
space.

As set forth above, the object program output portion 169
separates the special space arranged procedures and the
normal procedures. Next, the object program output portion
169 outputs data of a parameter region and so forth, outputs a
code portion and a data portion in combination as the object
program, and stores it in the object program storage portion
163.

With the construction set forth above, the code size of the
generated object program can be reduced. In association with
this, program space can be saved. Also, the execution
speed upon execution of the object program by the computer
or the CPU can be higher.

However, in the conventional program transformation
system disclosed in Japanese Unexamined Patent
Publication No. Heisei 1-118931, since the only procedures
subjected to arrangement optimization of the instruction
codes are procedures which the user defines in the
source program, improvement of efficiency of the object
program is limited.

Likewise, in the conventional program transformation
system disclosed in Japanese Unexamined Patent
Publication No. Heisei 9-34725, since the special space has
a finite code size, the procedures that can be arranged within
the special space are limited. Therefore, improvement of
efficiency of the object program is limited.

On the other hand, when the object program generated by
the program transformation system is to be executed by a
one-chip microcomputer consisting of a CPU, a decoder and so
forth, the object program is stored in the external primary
storage device so that each code of the object program is
read out sequentially from the primary storage device. Then,
after decoding by the decoder, the CPU parses the object
program for execution. In this case, in order to raise the
execution speed of the CPU, a cache memory, which has a
small storage capacity and high access speed and temporarily
stores the codes read out from the primary storage device,
which normally has a large storage capacity and low
access speed, is provided in the one-chip microcomputer.

In the one-chip microcomputer provided with the cache
memory, when the CPU executes the code, each code in the
object program read out from the primary storage device
cannot be decoded by the decoder and parsed and executed
by the CPU until it is once stored in the cache memory. In
a one-chip microcomputer of this kind, there are various
methods of storing each code read out from the primary
storage device in the cache memory. Among them, the direct
map method is one such method.

As shown in FIG. 17, in the direct map method, a cache
memory 171 is divided into a plurality of storage regions
(hereinafter referred to as cache lines). In conjunction
therewith, the primary storage device 172 is also divided
into storage regions. Each storage region of the primary
storage device 172 is placed in correspondence with a
cache line of the cache memory 171.

In FIG. 17, the cache memory 171 consists of five
cache lines 171a to 171e. Corresponding to these, the
primary storage device 172 is divided into storage regions
each having the same storage capacity as that of each cache
line. The storage regions correspond to the five
cache lines 171a to 171e in units of five. Namely,
storage regions 172-1a to 172-1e of the
primary storage device 172 correspond to the cache
lines 171a to 171e as a group. Similarly, the storage regions
172-2a to 172-2e correspond to the
cache lines 171a to 171e. The final storage regions 172-na to
172-ne (n is a natural number) also correspond to the
cache lines 171a to 171e.

When the object program to be executed by a one-chip
microcomputer employing the direct map method is
generated using the program transformation system, the
following drawbacks are encountered.

For example, when the source program described by C
language shown in FIG. 18 is transformed into the object
program by the program transformation system, as shown in
FIG. 19, respective instruction codes of procedures func_A
and func_B are stored in the primary storage device 172.

In FIG. 19, the instruction code of the procedure func_A
is stored in the storage regions 172-1a to 172-1c of the
primary storage device 172. Also, the instruction code of the
procedure func_B is stored in the storage regions
172-2a to 172-2b of the primary storage device 172.
Accordingly, the instruction code of the procedure func_A
corresponds to the cache lines 171a to 171c of
the cache memory 171. On the other hand, the instruction
code of the procedure func_B corresponds to the cache
lines 171a and 171b of the cache memory 171.

In such a case, when the CPU executes the object program
generated by transformation of the source program shown in
FIG. 18, the instruction code of the procedure func_A is
read out from the storage regions 172-1a to 172-1c of the
primary storage device 172 and is once stored in the cache
lines 171a to 171c in the cache memory 171, and thereafter
decoded by the decoder and parsed and executed by the
CPU.
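
The direct map correspondence of FIG. 17 and the placement of func_A and func_B in FIG. 19 can be sketched as follows. Numbering the storage regions from 0 (so that 172-1a is region 0 and 172-2a is region 5), the struct layout and the function names are assumptions introduced for this illustration.

```c
#define NUM_LINES 5  /* cache lines 171a to 171e in FIG. 17 */

/* Direct map: storage region index (0-based) -> cache line index. */
int cache_line_of(int region)
{
    return region % NUM_LINES;
}

/* A procedure occupying `count` consecutive regions starting at `first`. */
struct placement { int first; int count; };

/* Bitmask of the cache lines a placement is loaded on (bit 0 = line a). */
unsigned lines_used(struct placement p)
{
    unsigned mask = 0;
    for (int i = 0; i < p.count; i++)
        mask |= 1u << cache_line_of(p.first + i);
    return mask;
}

/* Non-zero when two placements share at least one cache line. */
int conflicts(struct placement a, struct placement b)
{
    return (lines_used(a) & lines_used(b)) != 0;
}
```

With func_A at regions 0 to 2 and func_B at regions 5 to 6, both procedures are loaded on cache lines a and b, so `conflicts` reports an overlap. Moving func_B to regions 8 to 9 (lines d and e) would remove it.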
Next, the instruction code of the procedure func_B is
read out from the storage regions 172-2a to 172-2b of the
primary storage device 172, and temporarily stored in the
cache lines 171a to 171b of the cache memory 171. Here,
although a part of the instruction code of the procedure
func_A has already been stored in the cache lines 171a and
171b of the cache memory 171, the instruction code of the
procedure func_B is stored over it (overwrites it).
Therefore, that part of the instruction code of the procedure
func_A cannot be read subsequently. Thereafter, the
instruction code of the procedure func_B stored in the cache
lines 171a to 171b of the cache memory 171 is decoded by the
decoder and parsed and executed by the CPU.

Next, according to the source program shown in FIG. 18, the
instruction code of the procedure func_A has to be executed
again. However, since the instruction code of the procedure
func_B is already stored in the cache lines 171a and 171b
of the cache memory 171, a part of the instruction code of
the procedure func_A cannot be read out. Therefore, the
instruction code of the procedure func_A is again read out
from the storage regions 172-1a to 172-1b of the primary
storage device 172. Then, the instruction code of the
procedure func_A is temporarily stored in the cache lines 171a
to 171b of the cache memory 171, decoded by the decoder
and parsed and executed by the CPU.

As set forth above, when the instruction codes of two
procedures which have a high possibility of being executed
sequentially in time are stored in storage regions of the
primary storage device 172 corresponding to the same cache
lines of the cache memory (this will be referred to as being
loaded on the same cache line), all or a part of the
instruction codes previously read out from the primary
storage device and stored in the cache memory 171 is
overwritten by the instruction codes of the procedure
subsequently read out from the primary storage device and
written on the same cache line of the cache memory 171. Such
a condition is referred to as conflict (cache conflict). If such
conflict is caused frequently, the effect of the cache memory
in speeding up execution by the CPU can be negated. Worse,
it may slow down the execution speed of the CPU.

As methods for storing each code read out from the primary
storage device in the cache memory, in addition to the direct
map method, there are a fully associative method, which
permits storing of data of the primary storage device in any
of the cache lines of the cache memory, a set associative
method, which is intermediate between the direct map method
and the fully associative method and in which a plurality of
cache lines of the cache memory are available for arranging
the data of the primary storage device, and so forth. As set
forth above, conflict of procedures on the cache memory can
occur.

In the conventional program transformation systems
disclosed in Japanese Unexamined Patent Publication No.
Heisei 1-118931 and Japanese Unexamined Patent Publication
No. Heisei 9-34725, no consideration has been given to
conflict as set forth above. Therefore, as a result of
arrangement optimization of the instruction codes of the
procedures or arrangement of the procedures in the special
space, if the instruction codes of two procedures having a
high possibility of being executed sequentially in time are
loaded on the same cache line of the cache memory 171,
conflict is inevitable. Accordingly, even if the code size of
the overall object program can be reduced, the execution
speed of the CPU cannot be accelerated.

On the other hand, for a procedure frequently used in
execution of the object program, the execution speed of the
CPU cannot be raised if the instruction code has to be read
from the primary storage device 172 and stored in the
corresponding cache lines of the cache memory at every use,
because the instruction code is not stored in the cache memory
171 or cannot be read out on account of conflict
(these are generally called cache misses). Therefore, it
becomes necessary to keep the frequently used procedure in
the cache memory 171 as long as possible without causing
conflict.

However, in the conventional program transformation
systems disclosed in Japanese Unexamined Patent Publication
No. Heisei 1-118931 and Japanese Unexamined Patent
Publication No. Heisei 9-34725, nothing is considered with
respect to cache misses in execution of the object program.
Accordingly, in this respect also, the execution speed of the
CPU cannot be raised.

SUMMARY OF THE INVENTION

It is an object of the present invention to provide a
program transformation method and a program transformation
system which can successfully prevent conflict of
various procedures on the cache memory, can prevent
cache misses of frequently used procedures and can thereby
speed up execution of an object program by a computer,
CPU or the like.

According to the first aspect of the invention, a program
transformation method for transforming a source program
described by a programming language into an object program
described by a language executable by a data processing
system, comprises

first process of transforming at least a part of procedure,
function or sub-routine used in the source program into
a form so that the object program can be stored in an
arbitrary storage region of a primary storage device of
the data processing system,

second process of arranging procedure, function or
sub-routine transformed or not transformed in the first
process in the storage region corresponding to cache
line of a cache memory among storage region of the
primary storage device without causing cache conflict
on the basis of information relating to the procedure,
function or sub-routine obtained during a process of
transformation of the source program into the object
program, and

third process of generating the object program, on the
basis of the result of arrangement.

In the preferred construction, the procedure, function or
sub-routine is at least one of that defined by a user in the
source program, that defined and inspected by the user, that
preliminarily prepared in a processing system in the
programming language and that preliminarily prepared in a
form of instruction code.

In another preferred construction, the information is
obtained by execution of a temporary object program
transformed from the source program and consists of
information indicative of number of times that the procedure,
function or sub-routine is actually called and information
indicative of call relationship between procedures, functions
or sub-routines.

In another preferred construction, the information is
obtained by execution of a temporary object program
transformed from the source program and consists of
information indicative of number of times that the procedure,
function or sub-routine is actually called and information
indicative of call relationship between procedures, functions
or sub-routines,

in the second process,

the procedures, functions or sub-routines are divided into
a plurality of groups on the basis of call frequency, and

the procedures, functions or sub-routines are arranged in
the storage region corresponding to cache lines of a
cache memory among the storage region of the primary
storage device.
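
One simple way to read the grouping by call frequency described above is sketched below. It is an assumption-laden illustration, not the arrangement algorithm of the invention: procedures are sorted by measured call count and laid out contiguously, and the leading ("hot") group is whatever prefix fits within one pass over a five-line direct-mapped cache. The struct fields and the function name `arrange` are hypothetical.

```c
#include <stdlib.h>

#define NUM_LINES 5   /* cache lines, as in the FIG. 17 example */

struct proc {
    const char *name;
    unsigned calls;   /* measured call count */
    int lines;        /* size in cache-line-sized storage regions */
    int first;        /* assigned first storage region (output) */
};

/* qsort comparator: most frequently called first. */
int by_calls_desc(const void *a, const void *b)
{
    const struct proc *pa = a, *pb = b;
    return (pa->calls < pb->calls) - (pa->calls > pb->calls);
}

/* Lay procedures out contiguously in descending call order. Returns how
   many leading ("hot") procedures fit together in one pass over the
   cache, so that no two of them share a cache line. */
int arrange(struct proc *p, int n)
{
    int next = 0, hot = 0;
    qsort(p, (size_t)n, sizeof *p, by_calls_desc);
    for (int i = 0; i < n; i++) {
        p[i].first = next;
        next += p[i].lines;
        if (next <= NUM_LINES)
            hot = i + 1;
    }
    return hot;
}
```

In this sketch the procedures outside the hot group can still map onto cache lines used by the hot group; a fuller arrangement would also use the call relationship information to decide which of those overlaps are harmless.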
According to the second aspect of the invention, a pro- 
gram transformation method for transforming a source pro- 
gram described by a programming language into an object 
program described by a language executable by a data 
processing system, comprises 

first process of transforming at least a part of procedure, 
function or sub-routine used in the source program into 
a form for storing in an arbitrary storage region of a 
primary storage device of the data processing system 
when the object program is used in the data processing 
system, 

second process of transforming the source program into
the object program and, in conjunction therewith,
concerning the object program, transforming
procedure, function or sub-routine defined by a user in
the source program into a form storable in an arbitrary
region of the primary storage device,

third process of linking the procedure, function or
sub-routine transformed in the first process and the object
program obtained in the second process,

fourth process of collecting dynamic information
consisting of information indicative of the number of times
that the procedure, function or sub-routine is actually
called, and information indicative of call relationship
between the procedures, functions or sub-routines, by
executing the object program obtained through the third
process,

fifth process of arranging the procedure, function or 
sub-routine in the storage region corresponding to the 
cache line of the cache memory among the storage 
region of the primary storage device with avoiding 
cache conflict, on the basis of the dynamic information, 
and 

sixth process of generating a final object program by 
linking the procedure, function or sub-routine trans- 
formed in the first process and the object program 
obtained in the second process, on the basis of the 
arrangement information. 
In the preferred construction, the procedure, function or 
sub-routine is at least one of that defined by a user in the 
source program, that defined and inspected by the user, that 
preliminarily prepared in a processing system in the pro- 
gramming language and that preliminarily prepared in a 
form of instruction code. 

In another preferred construction, in the second process,
the procedures, functions or sub-routines are divided into
a plurality of groups on the basis of call frequency, and
the procedures, functions or sub-routines are arranged in
the storage region corresponding to cache lines of a
cache memory among the storage region of the primary
storage device.
According to the third aspect of the invention, a program 
transformation method for transforming a source program 
described by a programming language into an object pro- 
gram described by a language executable by a data process- 
ing system, comprises 
first process of transforming the source program into a
temporary object program and, in conjunction
therewith, inserting a code for counting, upon executing
the temporary object program, the number of times
that the procedure, function or sub-routine is actually
called,

second process of linking one of the procedure, function
or sub-routine that defined by a user in the source
program, that defined and inspected by the user, that
preliminarily prepared in a processing system in the
programming language and that preliminarily prepared
in a form of instruction code, with the temporary object
program obtained through the first process,

third process of collecting dynamic information consisting
of information indicative of the number of times that the
procedure, function or sub-routine is actually called,
and information indicative of call relationship between
the procedures, functions or sub-routines, by executing
the object program obtained through the second
process,

fourth process of arranging the procedure, function or 
sub-routine in the storage region corresponding to the 
cache line of the cache memory among the storage 
region of the primary storage device with avoiding 
cache conflict, on the basis of the dynamic information, 
and 

fifth process of transforming at least part of one defined by
a user in the source program, one defined and inspected
by the user, one preliminarily prepared in a processing
system in the programming language and one preliminarily
prepared in a form of instruction code among the
procedure, function or sub-routine to be used in the
source program into a form storable in an arbitrary
storage region of the primary storage device, in which
the object program is stored as actually used in the data
processing system,

sixth process, after transforming the source program into 
the object program, concerning the object program, 
transforming procedure, function or sub-routine 
defined by a user in the source program into a form 
storable in arbitrary region of the primary storage 
device, 

seventh process of generating a final object program by 
linking the procedure, function or sub-routine trans- 
formed in the fifth process and the object program 
obtained in the sixth process, on the basis of the 
arrangement information. 
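
The counting code inserted in the first process and the dynamic information collected in the third process can be sketched as follows, using the procedures of FIG. 2. The hook names `profile_enter` and `profile_leave`, the procedure identifiers and the table layout are hypothetical; the text only states that a counting code is inserted.

```c
#define NPROC 3

enum { ID_FUNC, ID_FUNC1, ID_FUNC2 };

unsigned call_count[NPROC];        /* number of calls per procedure */
unsigned call_edge[NPROC][NPROC];  /* call relationship: [caller][callee] */
int current = -1;                  /* procedure now executing, -1 = none */

/* Inserted at procedure entry: record the call, return the caller id. */
int profile_enter(int id)
{
    int caller = current;
    call_count[id]++;
    if (caller >= 0)
        call_edge[caller][id]++;
    current = id;
    return caller;
}

/* Inserted at procedure exit: make the caller current again. */
void profile_leave(int caller)
{
    current = caller;
}

int count = 0;

void func1(void) { int c = profile_enter(ID_FUNC1); count += 1; profile_leave(c); }
void func2(void) { int c = profile_enter(ID_FUNC2); count += 2; profile_leave(c); }

void func(void)
{
    int c = profile_enter(ID_FUNC);
    func1();
    func2();
    func1();
    profile_leave(c);
}
```

After one call to `func`, `call_count` holds the number of times each procedure was actually called and `call_edge` holds the call relationship, which together correspond to the dynamic information used by the arrangement process.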

In the preferred construction, in the fourth process, 

the procedures, functions or sub-routines are divided into 
a plurality of groups on the basis of call frequency, and 

the procedures, functions or sub-routines are arranged in
the storage region corresponding to cache lines of a
cache memory among the storage region of the primary
storage device.

According to the fourth aspect of the invention, a program
transformation system for transforming a source program
described by a programming language into an object program
described by a language executable by a data processing
system, comprises

procedure transforming means for transforming at least a 
part of procedure, function or sub-routine used in the 
source program into a form so that the object program 
can be stored in an arbitrary storage region of a primary 
storage device of the data processing system, 
optimizing means for arranging procedure, function or
sub-routine transformed or not transformed in the procedure
transforming means in the storage region corresponding
to cache line of a cache memory among
storage region of the primary storage device without
causing cache conflict on the basis of information
relating to the procedure, function or sub-routine
obtained during a process of transformation of the
source program into the object program, and

generating means for generating the object program, on
the basis of the result of arrangement.
In the preferred construction, the procedure, function or
sub-routine is at least one of that defined by a user in the
source program, that defined and inspected by the user, that
preliminarily prepared on a processing system in the
programming language and that preliminarily prepared in a
form of instruction code.

In another preferred construction, the information is
obtained by execution of a temporary object program
transformed from the source program and consists of
information indicative of number of times that the procedure,
function or sub-routine is actually called and information
indicative of call relationship between procedures, functions
or sub-routines.

In another preferred construction, the information is
obtained by execution of a temporary object program
transformed from the source program and consists of
information indicative of number of times that the procedure,
function or sub-routine is actually called and information
indicative of call relationship between procedures, functions
or sub-routines,

in the optimizing means, 

the procedures, functions or sub-routines are divided into 
a plurality of groups on the basis of call frequency, and 

the procedures, functions or sub-routines are arranged in 
the storage region corresponding to cache lines of a 
cache memory among the storage region of the primary 
storage device. 

According to the fifth aspect of the invention, a program
transformation system for transforming a source program
described by a programming language into an object program
described by a language executable by a data processing
system, comprises

procedure transforming means for transforming at least a
part of procedure, function or sub-routine used in the
source program into a form for storing in an arbitrary
storage region of a primary storage device of the data
processing system when the object program is used in
the data processing system,

program transforming means for transforming the source
program into the object program, and in conjunction
therewith, and concerning the object program,
transforming procedure, function or sub-routine defined by
a user in the source program into a form storable in
arbitrary region of the primary storage device,

linking means for linking the procedure, function or
sub-routine transformed in the procedure transforming
means and the object program obtained in the
program transforming means,

dynamic information collecting means for collecting
dynamic information consisting of information indicative
of number of times that the procedure, function or
sub-routine is actually called, and information indicative
of call relationship between the procedure, function
or sub-routine, by executing the object program
obtained through the linking means, and

optimizing means for arranging the procedure, function or
sub-routine in the storage region corresponding to the
cache line of the cache memory among the storage
region of the primary storage device while avoiding
cache conflict, on the basis of the dynamic information,

the linking means generating a final object program by 
linking the procedure, function or sub-routine trans- 
formed in the procedure transforming means and the 
object program obtained in the program transforming 
means, on the basis of the arrangement information. 

According to the sixth aspect of the invention, a program 
transformation system for transforming a source program 
described by a programming language into an object pro- 
gram described by a language executable by a data process- 
ing system, comprises 

program transforming means for transforming the source 
program into a temporary object program, and in con- 
junction therewith, upon executing the temporary 
object program, inserting a code for counting number 
of times that the procedure, function or sub-routine is 
actually called, 

linking means for linking one of the procedure, function 
or sub-routine that defined by a user in the source 
program, that defined and inspected by the user, that 
preliminarily prepared in a processing system in the 
programming language and that preliminarily prepared 
in a form of instruction code, with the temporary object 
program obtained through the program transforming 
means, 

dynamic information collecting means for collecting
dynamic information consisting of information indicative
of number of times that the procedure, function or
sub-routine is actually called, and information indicative
of call relationship between the procedure, function
or sub-routine, by executing the object program
obtained through the linking means,

optimizing means for arranging the procedure, function or
sub-routine in the storage region corresponding to the
cache line of the cache memory among the storage
region of the primary storage device while avoiding
cache conflict, on the basis of the dynamic information,
and

procedure transforming means for transforming at least 
part of one defined by a user in the source program, one 
defined and inspected by the user, one preliminarily 
prepared in a processing system in the programming 
language and one preliminarily prepared in a form of 
instruction code among the procedure, function or 
sub-routine to be used in the source program into a 
form storable in an arbitrary storage region of the 
primary storage region, in which the object program is 
stored as actually used in the data processing system, 

the program transforming means transforming the source 
program into the object program, concerning the object 
program, transforming procedure, function or sub- 
routine defined by a user in the source program into a 
form storable in arbitrary region of the primary storage 
device, and the linking means generating a final object 
program by linking the procedure, function or sub- 
routine transformed in the procedure transforming 
means and the object program obtained in the program 
transforming means, on the basis of the arrangement 
information. 

According to another aspect of the invention, a computer
readable memory storing a language processing program for
transforming a source program described by a programming
language into an object program described by a language
executable by a data processing system, the language
processing program comprises

first process of transforming at least a part of procedure,
function or sub-routine used in the source program into
a form so that the object program can be stored in an
arbitrary storage region of a primary storage device of
the data processing system,

second process of arranging procedure, function or
sub-routine transformed or not transformed in the first
process in the storage region corresponding to cache
line of a cache memory among storage region of the
primary storage device without causing cache conflict
on the basis of information relating to the procedure,
function or sub-routine obtained during a process of
transformation of the source program into the object
program, and

third process of generating the object program, on the
basis of the result of arrangement.

Other objects, features and advantages of the present
invention will become clear from the detailed description
given herebelow.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention will be understood more fully from
the detailed description given herebelow and from the
accompanying drawings of the preferred embodiment of the
present invention, which, however, should not be taken to be
limitative to the invention, but are for explanation and
understanding only.

In the drawings:

FIG. 1 is a block diagram showing a construction of the
first embodiment of a program transformation system
according to the present invention;

FIG. 2 is an illustration showing one example of a source
program to be used in the first embodiment of the program
transformation system of FIG. 1;

FIG. 3 is a flowchart showing operation of the first
embodiment of the program transformation system of FIG. 1;

FIG. 4 is a flowchart showing an arrangement optimizing
process of a procedure of an optimizing portion in the first
embodiment of the program transformation system of FIG. 1;

FIG. 5 is an illustration showing one example of number
of cache lines to be occupied by procedures A to G;

FIG. 6 is an illustration showing one example of a
procedure call graph to be generated by the optimizing
portion;

FIG. 7 is an explanatory illustration for explaining the
arrangement optimization process of the procedures in the
optimizing portion in the first embodiment of the program
transformation system shown in FIG. 1;

FIG. 8 is an illustration showing one example of an
arrangement information;

FIG. 9 is an explanatory illustration for explaining
drawback to be caused when the arrangement optimizing process
of the procedure is not performed;

FIG. 10 is an illustration showing the case where
procedures C and D in the procedure call graph of FIG. 6 are
standard library procedure;

FIG. 11 is an illustration for explaining drawback in the
case where the standard library procedure is placed out of
object for the arrangement optimizing process of the
procedure;

FIG. 12 is a block diagram showing a construction of the
second embodiment of a program transformation system
according to the present invention;

FIG. 13 is a block diagram showing a construction of the
third embodiment of a program transformation system
according to the present invention;

FIG. 14 is a flowchart showing operation of the third
embodiment of the program transformation system of FIG. 13;

FIG. 15 is a block diagram showing a first example of the
construction of the conventional program transformation
system;

FIG. 16 is a block diagram showing a second example of
the construction of the conventional program transformation
system;

FIG. 17 is an illustration for explaining a relationship
between a cache memory and a primary storage device in a
direct map method;

FIG. 18 is an illustration showing one example of the case
where a source program to be used in the prior art is
expressed by C language; and

FIG. 19 is an explanatory illustration for explaining
conflict between procedure on a cache memory.

DESCRIPTION OF THE PREFERRED EMBODIMENT

The present invention will be discussed hereinafter in
detail in terms of the preferred embodiment of the present
invention with reference to the accompanying drawings. In
the following description, numerous specific details are set
forth in order to provide a thorough understanding of the
present invention. It will be obvious, however, to those
skilled in the art that the present invention may be practiced
without these specific details. In other instance, well-known
structures are not shown in detail in order to avoid
unnecessarily obscuring the present invention.

(First Embodiment)

FIG. 1 is a block diagram showing a construction of the
first embodiment of a program transformation system
according to the present invention.

The shown embodiment of a program transformation
system is generally constructed with first to fourth program
storage portions 31 to 34, a compiler 35, a linker 36, a
profiler 37, first and second information storage portions 38
and 39, an optimizing portion 40, first and second library
storage portions 41 and 42, and a library generating portion
43.

The first program storage portion 31 is constructed with a
storage medium, such as a semiconductor memory, an FD
(floppy disk), an HD (hard disk), a CD-ROM or the like. In
the first program storage portion 31, a source program
described by a programming language, such as C language or
so forth, is stored preliminarily. In the shown embodiment,
discussion will be given for the case where C language is
used as the programming language.

The compiler 35 compiles the source program into a
rearrangeable object program and thereafter transforms it into
a rearrangeable object program which can be arranged per
procedure, to store in the second program storage portion 32.
Here, rearrangeable object program is an object program
which can be stored in any storage regions of the primary
storage device. Also, arrangeable per procedure means that
arrangement of the procedure can be done within the
rearrangeable object program.
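
As a rough present-day analogue of an object program that is
"arrangeable per procedure", the source program of FIG. 2 can be
written so that each procedure is emitted into its own section, which
a linker can then place independently. The sketch below assumes a GCC
or Clang toolchain targeting ELF; the OWN_SECTION macro and the
".text.<name>" section names are illustrative assumptions and are not
taken from the embodiment.

```c
/* Hypothetical helper (not from the embodiment): on GCC/Clang ELF
 * targets, place a procedure in its own named section so that it
 * can be arranged independently at link time. Elsewhere it expands
 * to nothing and the program behaves identically. */
#if defined(__GNUC__) && defined(__ELF__)
#define OWN_SECTION(name) __attribute__((section(".text." name)))
#else
#define OWN_SECTION(name)
#endif

/* The source program of FIG. 2: three user procedures and a counter. */
int count = 0;

OWN_SECTION("func1") void func1(void) { count += 1; }
OWN_SECTION("func2") void func2(void) { count += 2; }

OWN_SECTION("func") void func(void)
{
    func1();
    func2();
    func1();
}
```

With the same toolchains, the -ffunction-sections compiler option
gives a comparable one-section-per-function layout without annotating
the source.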
It should be noted that, in the shown embodiment, the
procedure generally represents not only the procedure as
originally meant but also function and sub-routine, as set
forth above. In the procedure, a user procedure, a user
library procedure, a standard library procedure, a run-time
library procedure and so forth are included.

Here, the user procedure is a procedure defined by the
user in the source program. For example, when the user
prepares the source program shown in FIG. 2, procedures
func, func1 and func2 are all user procedures.

The user library procedure is originally the user
procedure which is considered to have high general
applicability and thus is stored in the first library storage
portion 41 after inspection, such as debugging or so forth. For
example, when the source program is stored in the first
library storage portion 41 with process, such as debugging
or so forth, after compiling the source program shown in
FIG. 2 into the rearrangeable object program, all of the
procedures func, func1 and func2 become user library
procedures.

The standard library procedure is the procedure
preliminarily prepared in a processing system, such as
compiler or so forth, in the programming language describing
the source program and can be used without definition by
the user. For example, in C language, a procedure printf for
outputting a character string as standard output, a procedure
strlen returning a length of the character string and so forth
are the standard library procedures.

The run-time library procedure is the procedure
preliminarily described in a form of instruction code, since
its code size is large while its general applicability is high,
and preliminarily stored in the first library storage portion
41. An instruction string having high general applicability
and having large code size should lower efficiency if the
compiler 35 generates the instruction code every time of
generation of the object program. Therefore, such instruction
string is preliminarily established as the procedure described
by the instruction code so that a code calling such procedure
is generated upon generation of the object program. Then, such
procedure is linked by the linker 36 later. For example, when a
float type parameter or operation is described in the source
program despite the fact that the CPU or the like executing
the final object program does not have instruction of
floating decimal point, the compiler 35 generates the object
program using the procedure consisting of a plurality of
instruction strings, such as float procedure add, float
procedure sub and so forth. The float procedure add or float
procedure sub are run-time library procedures.

The second program storage portion 32 is constructed
with the storage medium, such as semiconductor memory
including RAM or the like, FD, HD and so forth and stores
rearrangeable object program arrangeable per procedure.

The linker 36 establishes a link between the rearrangeable
object program arrangeable per procedure and stored in the
second program storage portion 32 and a rearrangeable
library (which will be discussed later) arrangeable per
procedure stored in the second library storage portion 42, to
generate an executable temporary object program to store in
the third program storage portion 33. In conjunction
therewith, on the basis of the arrangement information
(which will be discussed later) stored in the second
information storage portion 39, a link is established between the
rearrangeable object program arrangeable per procedure and
the rearrangeable library arrangeable per procedure to
generate an executable final object program to store in the fourth
program storage portion 34.

The third program storage portion 33 is constructed with
a storage medium, such as a semiconductor memory
including RAM or so forth, FD, HD or so forth and stores the
temporary object program. The fourth program storage
portion 34 is constructed with a storage medium, such as a
semiconductor memory including RAM or so forth, FD, HD
or so forth and stores the final object program.

The profiler 37 consists of a hardware emulator, a
software emulator and so forth and collects dynamic
information (profile information) consisting of call relationship
between procedures, call count of respective procedures,
loop structure information and so forth by executing the
temporary object program read out from the third program
storage portion 33. Then, the dynamic information thus
obtained is stored in the first information storage portion 38.

Here, the loop structure information is information
indicating that a certain procedure is called in the loop structure.
Particularly, certain markers which enable recognition of
start and end of the loop structure during operation of the
profiler 37 are written in the loop structure of the source
program. Then, by recognizing the start and end of the loop
structure with the markers by the profiler 37, it can be
recognized that the procedure called in the loop structure is
in the loop structure. By this, it can be judged that possibility
of sequential execution of the procedures within the loop
structure is high.

The first information storage portion 38 is constructed
with a storage medium, such as a semiconductor memory
including RAM or so forth, FD, HD and so forth, and stores
the dynamic information.

The optimizing portion 40 performs arrangement
optimization of all procedures on the basis of the dynamic
information stored in the first information storage portion 38 for
avoiding conflict on the cache memory of the procedures
having high possibility to be executed sequentially in time
axis and for avoiding cache miss of the frequently used
procedure. The optimizing portion 40 generates an arrangement
information for designating arrangement of the procedure
to the linker 36 and stores the arrangement information
thus generated in the second information storage portion 39.

The second information storage portion 39 is constructed
with a storage medium, such as a semiconductor memory
including RAM or so forth, FD, HD and so forth, and stores
the arrangement information.

The first library storage portion 41 is constructed with a
storage medium, such as a semiconductor memory including
ROM, RAM or so forth, FD, HD, CD-ROM or so forth and
stores the rearrangeable libraries including the standard
library procedure, the run-time library procedure and the
user library procedure. Here, the rearrangeable library is
rearrangeable object program. However, in order to
distinguish from the rearrangeable object program generated by
the compiler 35, the rearrangeable object program stored in
the first library storage portion 41 is referred to as the
rearrangeable library.

The library generating portion 43 transforms the
rearrangeable library stored in the first library storage portion 41
into the rearrangeable library arrangeable per procedure to
store in the second library storage portion 42. The second
library storage portion 42 is constructed with the storage
medium, such as the semiconductor memory including
RAM or so forth, FD, HD or so forth and stores
rearrangeable library arrangeable per procedure.

Next, operation of the program transformation system
having the construction set forth above will be discussed
with reference to FIGS. 3 to 10.
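
The dynamic information collected by the profiler 37 (call
relationship and call count) can be pictured as a table of
caller/callee pairs with counters. The following is a minimal sketch
only: it assumes instrumentation inserted at each call site rather
than the hardware or software emulator of the embodiment, and the
names record_call, call_count and the fixed-size table are
hypothetical, introduced here for illustration.

```c
#include <string.h>

#define MAX_EDGES 64 /* assumed table capacity for this sketch */

/* One entry of the call relationship: caller, callee, call count. */
struct call_edge {
    const char *caller;
    const char *callee;
    unsigned long count;
};

static struct call_edge edges[MAX_EDGES];
static int edge_total = 0;

/* Record one call from `caller` to `callee` (hypothetical hook). */
void record_call(const char *caller, const char *callee)
{
    for (int i = 0; i < edge_total; i++) {
        if (strcmp(edges[i].caller, caller) == 0 &&
            strcmp(edges[i].callee, callee) == 0) {
            edges[i].count++;
            return;
        }
    }
    if (edge_total < MAX_EDGES) {
        edges[edge_total].caller = caller;
        edges[edge_total].callee = callee;
        edges[edge_total].count = 1;
        edge_total++;
    }
}

/* Number of recorded calls from `caller` to `callee`, 0 if none. */
unsigned long call_count(const char *caller, const char *callee)
{
    for (int i = 0; i < edge_total; i++) {
        if (strcmp(edges[i].caller, caller) == 0 &&
            strcmp(edges[i].callee, callee) == 0)
            return edges[i].count;
    }
    return 0;
}

/* The source program of FIG. 2 with the hook at each call site. */
int count = 0;

void func1(void) { count += 1; }
void func2(void) { count += 2; }

void func(void)
{
    record_call("func", "func1"); func1();
    record_call("func", "func2"); func2();
    record_call("func", "func1"); func1();
}
```

After one call of func(), the table holds the pair (func, func1) with
count 2 and the pair (func, func2) with count 1, which is the kind of
information from which a procedure call graph can be built.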
At first, at step 301 shown in FIG. 3, the library generating
portion 43 transforms each rearrangeable library stored in
the first library storage portion 41 into rearrangeable library
arrangeable per procedure, recognizing the procedure unit, to
store in the second library storage portion 42.

A plurality of procedures as one group in one
rearrangeable library normally are included in a section, such as
text section (.text section), which is a unit of arrangement.
These procedures are aggregated together as the text section upon
linking in the linker 36. Each individual procedure cannot be
arranged independently per the procedure. Therefore, by
dividing each individual section into sections per individual
procedure, each procedure can be appropriately arranged
upon establishing link by the linker 36.

The process for dividing the text section within the
rearrangeable library into the sections per the procedure will
be discussed hereinafter.

At first, since a global attribute indicating that the own
procedure is useful externally and symbol information
relating to attribute of the procedure and so forth are added at the
leading end of the procedure, these global attribute and the
symbol information are recognized as a leading label for
making reference to a leading address.

Next, on the basis of recognition of the leading label of
each procedure, a section having distinct name per each
procedure, such as "procedure name_source program
name" and so forth, is newly generated. Then, information
relating to the sections in the rearrangeable library is
concentrically and newly registered in a certain portion, such as
a section header portion, in the rearrangeable library. If the
text section is not necessary in the rearrangeable library, the
relevant information is deleted.

Since the rearrangeable library has an offset indicative of
the position of various kinds of information in the
rearrangeable library at respective portions, when the new section is
added as set forth above, error is inherently caused in
correspondence between the information and offset.
Therefore, offset has to be updated. Each rearrangeable
library processed as set forth above is stored in the second
program storage portion 32 as rearrangeable library
arrangeable per procedure.

At step 302, the compiler 35 compiles the source program
into the rearrangeable object program, and thereafter,
performs the process similar to the process of the library
generating portion 43 at step 301 for transforming the
rearrangeable library into the rearrangeable library
arrangeable per procedure to store in the second program storage
portion 32. At step 303, the linker 36 establishes a link
between the rearrangeable object program arrangeable per
procedure stored in the second program storage portion 32
and the rearrangeable library arrangeable per procedure
stored in the second library storage portion 42 to generate
executable temporary object program to store in the third
program storage portion 33.

At step 304, the profiler 37 collects dynamic information
consisting of call relationship between respective procedures,
call counts of respective procedures, loop structure
information and so forth by executing the temporary object
program read out from the third program storage portion 33.
The profiler 37 stores the dynamic information thus obtained
in the first information storage portion 38.

At step 305, the optimizing portion 40 performs
arrangement optimization for all procedures on the basis of the
dynamic information stored in the first information storage
portion 38 to generate the arrangement information to store
in the second information storage portion 39. Detail of the
arrangement optimization of the procedure will be discussed
later.

At step 306, the linker 36 establishes a link between the
rearrangeable object program arrangeable per procedure and
the rearrangeable library arrangeable per procedure to
generate the executable final object program to store in the
fourth program storage portion 34. Thereafter, a sequence of
process is terminated.

Next, arrangement optimization process of the procedure
in the optimizing portion 40 will be discussed with reference
to FIGS. 4 to 10.

There are various kinds of arrangement optimization
methods of the procedure for reducing conflict on the cache
memory. In the shown embodiment, arrangement method by
cache line coloring disclosed in A. H. Hashemi, et al.,
"Efficient Procedure Mapping Using Cache Line Coloring",
SIGPLAN, pp. 171-182, June, 1997, is employed.

At first, as a premise, each code of the object program
generated by the program transformation system and stored
in the primary storage device is read out from the primary
storage device and thereafter, stored in the cache memory
consisting of four cache lines by the direct map method.

In the source program to be compiled, seven procedures
A to G are described in sequential order. When the source
program is compiled into the object program, the code sizes
of respective procedures A to G and number of cache lines to
be occupied (cache line number) forming the cache memory
are shown in FIG. 5.

On the other hand, as result of dynamic parsing in the
profiler 37, it is assumed that frequency of call from the
procedure A to the procedure B is "90", frequency of call
from the procedure B to the procedure C is "80", frequency
of call from the procedure C to the procedure D is "70",
frequency of call from the procedure A to the procedure E is
"40", frequency of call from the procedure E to the
procedure C is "100", frequency of call from the procedure E to
the procedure F is "0", and frequency of call from the
procedure F to the procedure G is "0".

The arrangement method by cache line coloring reduces
conflict on the cache memory in one generation (relationship
of direct call from one procedure to the other procedure)
using a procedure call graph which will be discussed later.

In the arrangement method, "color" is assigned for each
cache line, and arrangement of the procedure is performed
using number of "colors" required for arrangement, namely
cache line number, "colors" on which the procedures are
arranged, and non-useable groups.

In the shown embodiment, red (r) is assigned for the first
cache line, green (g) is assigned for the second cache line,
blue (b) is assigned for the third cache line, and yellow (y)
is assigned for the fourth cache line, respectively. The
non-useable group concerns procedures in a relationship to call
and to be called directly, and refers to an aggregated
group of the "colors" occupied by the already arranged
procedures.

At first, at step 401 shown in FIG. 4, the optimizing
portion 40 generates the procedure call graph as shown in
FIG. 6 on the basis of the dynamic information stored in the
first information storage portion 38. In FIG. 6, nodes A to G
represent procedures. Lines between the nodes represent call
relationship of the procedures. Numerical values added for
the lines represent call frequency of the call from start point
nodes, namely the procedures at the roots of arrows, to end
point nodes, namely the procedures at the tips of the arrows.

At step 402, concerning the procedure call graph, lines
and nodes are divided into a group having high call
frequency and a group having low call frequency. In the shown
embodiment, as can be appreciated from FIG. 6, the group
having high call frequency consists of nodes A to E, the
line from the node A to the node B, the line from the node
A to the node E, the line from the node B to the node C, the
line from the node C to the node D and the line from the
node E to the node C. On the other hand, the group having
low call frequency consists of the nodes F and G, the line
from the node E to the node F and the line from the node F
to the node G.

At step 403, the lines and the nodes are rearranged within
each of the divided groups. Namely, in the group having high
call frequency, the lines are rearranged in descending
order of the numerical values added to the lines. In
contrast to this, in the group having low call frequency,
the nodes are rearranged in descending order of the
cache line number of the procedure and are arranged mainly
for filling up void in the program space.

In the shown embodiment, as can be appreciated from
FIG. 6, in the group having high call frequency, the lines are
arranged in the sequential order of the line from the node E
to the node C, the line from the node A to the node B, the
line from the node B to the node C, the line from the node
C to the node D and the line from the node A to the node E.
On the other hand, in the group having low call frequency,
as can be appreciated from FIG. 5, since the cache line
number of the procedure G is 2 and the cache line number
of the procedure F is 1, the nodes are arranged in sequential
order of the node G and then the node F.
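
The grouping of step 402 and the ordering of the high call frequency
lines at step 403 can be sketched as follows. This is an illustration
only: the text does not state how the boundary between high and low
call frequency is chosen, so the threshold parameter is an
assumption, and the names edge, select_high_frequency and fig6_edges
are hypothetical.

```c
#include <stdlib.h>

/* One line of the procedure call graph: caller, callee, frequency. */
struct edge {
    char from;
    char to;
    int freq;
};

/* qsort comparator: descending order of call frequency. */
static int by_freq_desc(const void *a, const void *b)
{
    const struct edge *x = a;
    const struct edge *y = b;
    return (y->freq > x->freq) - (y->freq < x->freq);
}

/* Move the lines whose frequency is at least `threshold` (assumed
 * criterion) to the front of `e`, sort them in descending order of
 * frequency, and return how many lines are in that group. */
int select_high_frequency(struct edge *e, int n, int threshold)
{
    int high = 0;
    for (int i = 0; i < n; i++) {
        if (e[i].freq >= threshold) {
            struct edge tmp = e[high];
            e[high] = e[i];
            e[i] = tmp;
            high++;
        }
    }
    qsort(e, (size_t)high, sizeof e[0], by_freq_desc);
    return high;
}

/* Call frequencies assumed in the text for the graph of FIG. 6. */
struct edge fig6_edges[] = {
    {'A', 'B', 90}, {'B', 'C', 80}, {'C', 'D', 70}, {'A', 'E', 40},
    {'E', 'C', 100}, {'E', 'F', 0}, {'F', 'G', 0},
};
```

Calling select_high_frequency(fig6_edges, 7, 1) leaves the five high
call frequency lines at the front of the array in the order described
above, with the two lines of frequency "0" after them.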

At step 404, judgment is made whether the line of the 
group of high call frequency is left or not. When the result 
of judgment is positive ("YES"), the process is advanced to 
step 405. At this time, since the process is executed at the 
first time, all lines are left. Therefore, the result of judgment 
becomes "YES". 

At step 405, check is performed whether the nodes at both
ends of the line at the highest order in the rearranged order
among remaining lines are both not yet arranged. If the
result is YES, the process is advanced to step 406. In the
shown case, the line of the highest order among the
remaining lines is the line from the node E to the node C, and
since the process is executed at the first time, the nodes E
and C at both ends are not yet arranged. Accordingly, the
result of judgment becomes "YES".

At step 406, after arranging the nodes at both ends of the
objective line adjacent with each other, the process is
advanced to step 407. In this case, the nodes at both ends of
the objective line can be arranged at arbitrary position in the
program space. In this case, the cache line number of the
procedures E and C are both 2, as can be appreciated from
FIG. 5. As shown in the first row of FIG. 7, portions E1 and
E2 of the procedure E are arranged on the first and second
cache lines (colors are red (r) and green (g)). Portions C1 and
C2 of the procedure C are arranged on the third and fourth
cache lines (colors are blue (b) and yellow (y)). In this case,
the nodes E and C are considered to be merged to form a
single node. Such single node will be referred to as
composite node E-C.

At step 407, after updating the non-useable groups, the
process is returned to step 404.

In case of the node E, since the "colors" of the cache lines
on which the node C, which is in a relationship to be directly
called by the node E, is arranged are blue (b) and yellow (y),
the non-useable group becomes E{b, y}. Similarly, in case of
the node C, the "colors" of the cache lines on which the node
E, which is in a relationship to directly call the node C, is
arranged are red (r) and green (g). Therefore, the non-useable
group becomes C{r, g}.

The processes of the steps 404 to 407 set forth above are
repeated until no line whose nodes at both ends are both
not yet arranged is left among the lines in the group
having high call frequency. If no line is left in the group
having high call frequency, the result of judgment at step
404 becomes "NO". Then, process is advanced to step 416.
In this case, since the line from the node A to the node B is
left and the nodes A and B at both ends of the line are not
yet arranged, the processes at steps 406 and 407 are
performed.

As can be appreciated from FIG. 5, both of the procedures
A and B have cache line number 1. Therefore, as shown in
the first row of FIG. 7, the procedure A is arranged on the
third cache line (color is blue (b)), and the procedure B is
arranged on the fourth cache line (color is yellow (y)). Then,
the nodes A and B become a composite node A-B. Next, in
case of the node A, since the "color" of the cache line on
which the node B, which is in a relationship to be directly
called by the node A, is arranged is yellow (y), the
non-useable group becomes A{y}. Similarly, in case of the node
B, since the "color" of the cache line on which the node A,
which is in a relationship to directly call the node B, is
arranged is blue (b), the non-useable group becomes B{b}.

It should be noted that, although the line from the node A to
the node E is left in the procedure call graph shown in
FIG. 6, the colors red (r) and green (g) of the cache lines on
which the node E is arranged are not included in the
non-useable group of the node A. This is because the line
from the node A to the node E has not yet been processed
owing to its low order of call frequency. In the current
status, conflict can be caused in connection with the line
from the node A to the node E. However, since the lines are
processed in their original order, the current status is
acceptable.

On the other hand, when a line in the group having high call
frequency is left but the node at either end of that line has
already been arranged, the result of judgment at step 405
becomes "NO", and the process is advanced to step 408.

At step 408, a check is performed as to whether the line
being processed connects nodes in two different composite
nodes. If the result of checking at step 408 is "YES", the
process is advanced to step 409. In the current condition,
the line directed from the node B to the node C, which has
the highest order among the remaining lines, connects the
composite node E-C and the composite node A-B. The result of
judgment at step 408 therefore becomes "YES", and the
process is advanced to step 409.

At step 409, the two composite nodes connected by the line
being processed are merged into a single composite node.
This is done by coupling the composite node having the
smaller number of merged nodes (hereinafter referred to as
the "shorter composite node") to the composite node having
the greater number of merged nodes (hereinafter referred to
as the "longer composite node"). Upon coupling, the shorter
composite node is also coupled to the longer composite node
in the program space.

First, it is determined on which side of the longer composite
node the shorter composite node is to be arranged.
Specifically, for the node that belongs to the longer
composite node and forms the line being processed, it is
judged whether its center position is closer to the left
boundary or to the right boundary of the longer composite
node, using the number of cache lines required to reach each
boundary. The shorter composite node is then arranged on the
side toward which the center position is inclined.

Next, the orientation of the shorter composite node is
determined, and the shorter composite node is arranged.
Specifically, the orientation is determined so that, of the
nodes forming the line being processed, the node that does
not belong to the longer composite node is arranged as close
as possible to the already arranged node in the longer
composite node. If conflict is caused by the arrangement of
the shorter composite node, the nodes that do not belong to
the longer composite node are shifted away from the longer
composite node until the conflict is resolved. However, when
the conflict cannot be avoided at any arrangement position,
those nodes are returned to their initial arrangement
positions. Then, the process is advanced to step 410.

In the shown embodiment, both of the composite nodes E-C and
A-B have two merged nodes. Therefore, either composite node
can be taken as the shorter composite node. In the shown
case, the composite node A-B is taken as the shorter
composite node.

Among the nodes E and C forming the longer composite node
E-C, the center position of the node C, which forms the line
directed from the node B to the node C being processed, is
located between the portions C1 and C2 as shown in the first
row of FIG. 7. Therefore, the number of cache lines required
to reach the left boundary of the longer composite node E-C
is three, whereas the number of cache lines required to reach
the right boundary is one. Accordingly, the shorter composite
node A-B is arranged on the right side of the longer
composite node E-C.
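
The side selection just described can be sketched as follows.
This is an illustration only. Positions and lengths are
measured in cache lines from an arbitrary origin, the function
name choose_side is hypothetical, and resolving a tie toward
the left is an assumption made here for definiteness.

```python
def choose_side(composite_start, composite_len, node_start, node_len):
    """Pick the boundary of a composite node nearest to a node's center.

    Returns (side, lines_to_left, lines_to_right), where side is
    "left" or "right". A tie is resolved as "left" (an assumption).
    """
    center = node_start + node_len / 2
    to_left = center - composite_start
    to_right = (composite_start + composite_len) - center
    side = "right" if to_right < to_left else "left"
    return side, to_left, to_right


# Composite node E-C spans four lines; C occupies the last two of them.
side, to_left, to_right = choose_side(0, 4, 2, 2)
```

For the composite node E-C this gives three lines to the left
boundary and one line to the right boundary, so the right side
is chosen.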

Next, the orientation of the shorter composite node A-B is
determined so that, of the nodes B and C forming the line
directed from the node B to the node C being processed, the
node B, which does not belong to the longer composite node
E-C, is located as close as possible to the already arranged
node C. Thus, the shorter composite node becomes B-A. Since
no conflict is caused by this arrangement, the arrangement is
maintained as is (see the second row of FIG. 7). A new
composite node E-C-B-A is thereby generated.

At step 410, a check is performed as to whether a vacant
region has been formed in the program space through the
foregoing arrangement process. If the result of checking at
step 410 is "NO", the process is advanced to step 407. In the
shown case, since no vacant region is formed, the process is
advanced to step 407. Then, after updating the non-useable
group, the process is returned to step 404.

In the case of the node A, since the "color" of the cache line
on which the node B, in a direct call relationship with the
node A, is arranged is red (r), the non-useable group becomes
A{r}. Similarly, in the case of the node B, since the "color"
of the cache line on which the node A, in a direct call
relationship with the node B, is arranged is green (g), and
the "colors" of the cache lines on which the node C, in a
direct call relationship with the node B, is arranged are
blue (b) and yellow (y), the non-useable group becomes
B{g, b, y} (see the second row of FIG. 7).

On the other hand, when the result of checking at step 410
is "YES", namely when a vacant region has been formed in the
program space through the foregoing arrangement process, the
process is advanced to step 411.

At step 411, the node having the highest order in the group
having low call frequency is arranged in the vacant region.
Thereafter, the process is advanced to step 407.

The processes at steps 404, 405, 408 to 411 and 407 set forth
above are repeated until, among the lines of the group having
high call frequency whose nodes at either end have already
been arranged, no line connecting nodes in two mutually
different composite nodes is found. If no line of the group
having high call frequency is left, the result of checking at
step 404 becomes "NO", and the process is advanced to step
416.

If, among the lines of the group having high call frequency
whose nodes at either end have already been arranged, there
is no line connecting nodes in two mutually different
composite nodes, the result of checking at step 408 becomes
"NO". Then, the process is advanced to step 412.

Through the process set forth above, the line from the node E
to the node C, the line from the node A to the node B and the
line from the node B to the node C are processed. Therefore,
in the group having high call frequency, the line from the
node C to the node D and the line from the node A to the node
E are left. Although one of the nodes of each of these lines
has already been arranged, neither line connects two
different composite nodes. Therefore, the result of checking
at step 408 becomes "NO". Then, the process is advanced to
step 412.

At step 412, a check is performed as to whether one of the
two nodes forming the line being processed belongs to a
composite node while the other node has not yet been
arranged. If the result of checking at step 412 is "YES", the
process is advanced to step 413. In the present case, the
line from the node C to the node D, which has the highest
order among the remaining lines, has the node C belonging to
the composite node E-C-B-A and the node D which is not yet
arranged. Therefore, the result of checking at step 412
becomes "YES". Then, the process is advanced to step 413.
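
The branching among steps 405, 408 and 412 described so far
might be summarized by the following sketch. It is a
simplification under stated assumptions: the mapping
composite_of and the function next_step are hypothetical
names, a node absent from the mapping is taken to be not yet
arranged, and a line that falls through all three checks is
handed to step 414.

```python
def next_step(line, composite_of):
    """Return the step number that handles a line of the call graph.

    line: (caller, callee)
    composite_of: node name -> composite node id, for arranged nodes only
    """
    a, b = (composite_of.get(node) for node in line)
    if a is None and b is None:
        return 406  # step 405 "YES": neither end is arranged yet
    if a is not None and b is not None and a != b:
        return 409  # step 408 "YES": ends lie in two different composite nodes
    if (a is None) != (b is None):
        return 413  # step 412 "YES": one end arranged, the other not
    return 414      # both ends lie in the same composite node


# State after the composite node E-C-B-A has been formed.
arranged = {"E": 0, "C": 0, "B": 0, "A": 0}
step_for_c_to_d = next_step(("C", "D"), arranged)
step_for_a_to_e = next_step(("A", "E"), arranged)
```

In this state the line from the node C to the node D is routed
to step 413, and the line from the node A to the node E falls
through to step 414.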

At step 413, the non-arranged node of the line being
processed is coupled with the composite node. Upon coupling,
the non-arranged node is also coupled with the composite node
in the program space.

First, it is determined on which side of the composite node
the non-arranged node is to be arranged. Specifically, for
the node that belongs to the composite node and forms the
line being processed, it is judged whether its center
position is closer to the left boundary or to the right
boundary of the composite node, using the number of cache
lines required to reach each boundary. The non-arranged node
is then arranged on the side toward which the center position
is inclined.

In this case, if conflict is caused by the arrangement of the
non-arranged node, the non-arranged node is shifted away from
the nodes forming the composite node until the conflict is
resolved. However, when the conflict cannot be avoided at any
arrangement position, the non-arranged node is returned to
its initial arrangement position. Then, the process is
advanced to step 410.
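
The shifting rule of step 413 can be sketched for a node
attached on the left side of a composite node. The sketch
assumes a four-color cache (red, green, blue, yellow) and
treats a conflict as any overlap between the colors the node
would occupy and its non-useable colors. The name
place_on_left and its parameters are hypothetical.

```python
COLORS = "rgby"  # assumed four-line cache: red, green, blue, yellow


def place_on_left(composite_start, length, forbidden, num_colors=4):
    """Return the start line for a node attached left of a composite node.

    The node is first placed immediately left of composite_start and is
    shifted further left one cache line at a time until none of the
    colors it covers is in forbidden. If no shift within one full color
    cycle works, the initial position is returned.
    """
    initial = composite_start - length
    for shift in range(num_colors):
        start = initial - shift
        used = {COLORS[(start + i) % num_colors] for i in range(length)}
        if not (used & forbidden):
            return start
    return initial


# A two-line node that must avoid blue and yellow, composite starting at 0.
start = place_on_left(0, 2, {"b", "y"})
gap = 0 - (start + 2)  # vacant cache lines between the node and the composite
```

For a two-line node whose non-useable colors are blue and
yellow, the sketch leaves a vacant region of two cache lines
between the node and the composite node.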

Among the nodes E, C, B and A forming the composite node
E-C-B-A, the center position of the node C forming the line
from the node C to the node D, which is the line being
processed, is located between the portions C1 and C2 as shown
in the first row of FIG. 7. Therefore, the number of cache
lines required to reach the left boundary of the composite
node E-C-B-A is three, and the number of cache lines required
to reach the right boundary of the composite node E-C-B-A is
also three. Accordingly, the non-arranged node can be
arranged on either the left or the right side of the
composite node E-C-B-A. In the shown case, the node D is
arranged on the left side of the composite node E-C-B-A.

In this case, when the portions D1 and D2 of the node D are
arranged immediately to the left of the portion E1 of the
node E, conflict can be caused between the portions D1 and D2
of the node D and the portions C1 and C2 of the node C.
Therefore, in order to avoid conflict, the portions D1 and D2
of the node D are arranged at a distance of two cache lines
from the portion E1 of the node E (see the third row of
FIG. 7).

Next, in this case, a vacant region corresponding to two
cache lines is formed on the right side of the portions D1
and D2 of the node D. The result of checking at step 410
therefore becomes "YES". Thus, the process is advanced to
step 411.

At step 411, the node having the highest order in the group
having low call frequency is arranged in the vacant region of
two cache lines on the right side of the portions D1 and D2
of the node D. In the shown case, the node G is arranged in
this vacant region (see the fourth row of FIG. 7). Then, the
process is advanced to step 407 and, after updating the
non-useable group, is returned to step 404. In the case of
the node D, the "colors" of the cache lines on which the node
C, in a direct call relationship with the node D, is arranged
are blue (b) and yellow (y), so the non-useable group becomes
D{b, y} (see the third row of FIG. 7).

The processes at steps 404, 405, 408, 412, 413, 410, 411 and
407 set forth above are repeated until, among the remaining
lines of the group having high call frequency, no line is
found that does not connect nodes in two mutually different
composite nodes but has one node belonging to a composite
node and the other node not yet arranged. If no line of the
group having high call frequency is left, the result of
checking at step 404 becomes "NO". Then, the process is
advanced to step 416.

If no such line is found among the remaining lines of the
group having high call frequency, the result of checking at
step 412 becomes "NO". Then, the process is advanced to step
414.

Through the process set forth above, the line from the node E
to the node C, the line from the node A to the node B, the
line from the node B to the node C and the line from the node
C to the node D are processed. Therefore, in the group having
high call frequency, only the line from the node A to the
node E is left. However, this line is not a line that has one
node belonging to a composite node and the other node not yet
arranged. Therefore, the result of checking at step 412
becomes "NO". Then, the process is advanced to step 414.

At step 414, a check is performed as to whether both of the
two nodes forming the line being processed belong to the same
composite node. If the result of checking at step 414 is
"YES", the process is advanced to step 415. In the present
case, the line from the node A to the node E, which has the
highest order among the remaining lines, has both of its
nodes A and E belonging to the composite node E-C-B-A.
Therefore, the result of checking at step 414 becomes "YES".
Then, the process is advanced to step 415.

At step 415, conflict between the nodes forming the line
being processed is eliminated. Namely, if conflict is caused
between the nodes forming the line being processed, the node
closer to the boundary of the composite node is shifted
beyond the boundary until the conflict is resolved. However,
when conflict cannot be avoided at any position, the node is
returned to its initial position. Then, the process is
advanced to step 410.

In the shown embodiment, the line being processed is the line
from the node A to the node E. As can be appreciated from the
fourth row of FIG. 7, conflict is caused between the nodes A
and E. Of the nodes A and E forming the line from the node A
to the node E, the node A is closer to the boundary of the
composite node E-C-B-A. Therefore, the node A is shifted
beyond the boundary. In the shown case, since conflict can be
avoided by shifting the node A by one cache line, the node A
is arranged at the position shifted by one cache line (see
the fifth row of FIG. 7).

Next, in the shown case, a vacant region of one cache line is
formed on the right side of the node A. Therefore, the result
of judgment at step 410 becomes "YES". Then, the process is
advanced to step 411.

At step 411, the node remaining in the group having low call
frequency is arranged in the vacant region of one cache line
on the right side of the node A. In the present case, after
arranging the node F (see the sixth row of FIG. 7), the
process is advanced to step 407. After updating the
non-useable group, the process is returned to step 404. In
the case of the node A, the "colors" of the cache lines on
which the nodes E and B, which are directly called by the
node A, are arranged are red (r) and green (g), so the
non-useable group becomes A{r, g} (see the fifth row of
FIG. 7). Similarly, in the case of the node B, the "colors"
of the cache lines on which the node C and the node A, each
in a direct call relationship with the node B, are arranged
are blue (b) and yellow (y). Thus, the non-useable group
becomes B{b, y} (see the fifth row of FIG. 7).

The processes at steps 404, 405, 408, 412, 414, 415, 410, 411
and 407 set forth above are repeated until no line connecting
nodes in the same composite node is left. If no line in the
group having high call frequency is left, the result of
judgment at step 404 becomes "NO". Then, the process is
advanced to step 416.

At step 416, the remaining nodes in the group having low call
frequency are arranged by a simple depth-first search. When a
plurality of composite nodes have been arranged away from
each other through the process set forth above, priority is
given to each composite node on the basis of the call
frequency to determine the final arrangement. Then, the
sequence of processes ends.

One example of the arrangement information obtained through
the arrangement optimizing process of the procedures set
forth above is illustrated in FIG. 8.

As a premise, it is assumed that the procedures A and B are
included in a source program file of a file name "test1.o",
the functions E, F and G are included in a source program
file of a file name "test2.o", and the procedures C and D are
standard library procedures included in a library file of a
file name "libc.a". Also, the size of one cache line is
assumed to be 32 bytes (0x20).
In FIG. 8, "GROUP1" is a segment name which is given in the
case when the output sections are handled as one group.
"!LOAD" represents a segment type. This field is fixed. In
the shown case, "LOAD" represents a segment to be loaded in
the memory. "?RX" represents a segment attribute which shows
the read/write/execute attribute of the segment. In the case
of an instruction portion (text code), it is fixed at "?RX".
"A0x1000" represents the alignment condition applied upon
arrangement of the segment in the memory space. In the shown
case, the alignment condition is "0x1000".

On the other hand, "_D_LIB", "_G_test2" and so forth are
output section names which represent groups formed by
coupling input sections of the same type and attribute.
"$PROGBITS" represents the type of the input section. In the
case of the text code, the type of the input section is fixed
to "$PROGBITS". "?AX" represents a section attribute which
represents whether the section occupies memory, is writeable,
is executable and so forth. In the case of the text code, it
is fixed at "?AX".

"A0x20" represents an alignment condition applied upon
arranging the input section in the output section. Since
arrangement is considered per cache line, the alignment
condition is 0x20, which is the size of one cache line.

"_D_test1", "_G_test2" and so forth represent names of the
input sections to be arranged in the output section.
"libc.a", "test2.o" and so forth represent file names
included in the input section. When the output section is
formed by aggregating the same input sections of a plurality
of files, it is possible to describe a plurality of file
names.
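
The effect of the two alignment conditions can be illustrated
with a small layout sketch. The segment base of 0x1000 and
the cache line size of 0x20 are taken from the description of
FIG. 8, and the order of the sections follows the order in
which FIG. 8 lists them. The byte sizes of the procedures and
the names align_up and lay_out are hypothetical and are
chosen only for illustration.

```python
def align_up(address, alignment):
    """Round address up to a multiple of alignment (a power of two)."""
    return (address + alignment - 1) & ~(alignment - 1)


def lay_out(sections, base, alignment):
    """Assign a start address to each (name, size_in_bytes) in order."""
    addresses = {}
    cursor = base
    for name, size in sections:
        cursor = align_up(cursor, alignment)  # start on a cache line boundary
        addresses[name] = cursor
        cursor += size
    return addresses


# Hypothetical sizes in bytes; only the order comes from FIG. 8.
sections = [("D", 40), ("G", 50), ("E", 64), ("C", 60),
            ("B", 20), ("F", 32), ("A", 10)]
addresses = lay_out(sections, base=0x1000, alignment=0x20)
```

Under these assumed sizes every procedure starts on a 32-byte
boundary, so each procedure begins at the start of a cache
line regardless of the size of the procedure preceding it.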

As set forth above, by providing an input section name for
each procedure, the order of arrangement of the procedures
can be designated together with the arrangement condition.
With the construction of the shown embodiment, the library
generating portion 43 for transforming the rearrangeable
library into the rearrangeable library arrangeable per
procedure is provided, dynamic information is collected with
respect to all procedures upon dynamic analysis by the
profiler 37, the arrangement information determining the
optimal arrangement for all of the procedures is generated on
the basis of the dynamic information, and all procedures are
arranged on the basis of the arrangement information.
Therefore, conflict on the cache memory between the
procedures forming the object program can be reduced. In
conjunction therewith, cache misses of frequently used
procedures can be reduced. As a result, the execution speed
of the object program on the computer or CPU can be
increased.

In this respect, assume that seven procedures A to G are
included in the source program to be compiled and are
described in sequential order, and that the numbers of cache
lines of the respective procedures A to G when the source
program is compiled into the object program are as shown in
FIG. 5. If the arrangement optimization process of the
procedures by the optimizing portion 40 is not performed at
all, the procedures A to G in the source program are compiled
into the object program in the described order. Therefore,
conflict is caused between the procedures C and E as shown in
FIG. 9.

On the other hand, in the present invention, since all
procedures are handled equally without distinguishing the
kinds of the procedures and arrangement is possible per
procedure, the possibility of completely eliminating conflict
is high. In the procedure call graph shown in FIG. 6, the
procedures C and D are standard library procedures, as shown
in FIG. 10. If, as in the prior art, these procedures C and D
are excluded from the arrangement optimization process of the
procedures, it is not possible to completely avoid conflict,
even when the arrangement method by cache line coloring
disclosed in the above-identified publication is used, for
the reason set out below. When call instructions for a
plurality of standard library procedures are described in the
source program, the corresponding standard library procedures
are read out from the library storage portion upon
establishing a link in the linker and, in the prior art, are
arranged together in a special region of the primary storage
device. It has not been possible to designate the arrangement
per procedure.

FIG. 11 shows the arrangement optimization process of the
procedures by the optimizing portion in the case where the
standard library procedures C and D are excluded from the
arrangement optimization process. In this case, since the
standard library procedures C and D are excluded from the
arrangement optimization process, the line from the procedure
E to the procedure C and the line from the procedure B to the
procedure C are naturally excluded from the process.
Accordingly, as shown in the fifth row of FIG. 11, conflict
of the procedure E and the procedure C cannot be avoided.
(Second Embodiment) 

Next, discussion will be given for the second embodiment 
of the program transformation system according to the 
present invention. 

FIG. 12 is a block diagram showing a construction of the
second embodiment of the program transformation system
according to the present invention. In FIG. 12, like elements
to those illustrated in FIG. 1 will be identified by like
reference numerals, and detailed discussion of such common
elements will be omitted to avoid redundancy and to keep the
disclosure simple enough to facilitate clear understanding of
the present invention.

In the shown embodiment of the program transformation 
system shown in FIG. 12, the rearrangeable library stored in 
the first library storage portion 41 is also supplied to the 
linker 36 in addition to the library generating portion 43. The 
following is the reason why such construction is provided. 

Namely, when the number of rearrangeable libraries is large,
it takes a long time to transform all of the rearrangeable
libraries stored in the first library storage portion 41 into
the rearrangeable libraries arrangeable per procedure.

Therefore, a part of the rearrangeable libraries is linked
directly by the linker 36, together with the rearrangeable
libraries arrangeable per procedure, without being
transformed into rearrangeable libraries arrangeable per
procedure by the library generating portion 43.

In this case, the rearrangeable library to be directly 
supplied to the linker 36 may be judged by the linker 36 on 
the basis of the dynamic information stored in the first 
information storage portion 38 or the code size of each 
procedure, for example. 

With the construction set forth above, the final object
program can be generated in a shorter period than in the
first embodiment.
(Third Embodiment) 

Next, discussion will be given for the third embodiment
of the program transformation system according to the
present invention.

FIG. 13 is a block diagram showing a construction of the
third embodiment of the program transformation system
according to the present invention. In FIG. 13, like elements
to those illustrated in FIG. 12 will be identified by like
reference numerals, and detailed discussion of such common
elements will be omitted to avoid redundancy and to keep the
disclosure simple enough to facilitate clear understanding of
the present invention.

In the third embodiment of the program transformation system,
in place of the compiler 35 and the profiler 37 shown in
FIG. 12, a compiler 44 and a profiler 45 are newly provided.

Unlike the profiler 37 shown in FIG. 12, the profiler 45 has
only the function of reading the executable temporary object
program stored in the third program storage portion 33 and
executing the read-out temporary object program. On the other
hand, upon compiling the source program into the executable
temporary object program, the compiler 44 inserts into the
temporary object program a count code that counts the number
of times each procedure is actually executed when the
temporary object program is executed by the profiler 45. By
this, the profiler 45 can collect dynamic information
consisting of the call relationship between the procedures,
the call counts of the respective procedures and so forth by
executing the temporary object program, and stores the
obtained dynamic information in the first information storage
portion 38.

Next, discussion will be given for operation of the third
embodiment of the program transformation system constructed
as set forth above, with reference to FIG. 14.

First, at step 1401 shown in FIG. 14, the compiler 44
compiles the source program (see FIG. 2) read out from the
first program storage portion 31 into the executable
temporary object program while inserting the count code, and
stores it in the second program storage portion 32.

At step 1402, the linker 36 establishes a link between the
temporary object program with the inserted count code stored
in the second program storage portion 32 and the
rearrangeable library stored in the first library storage
portion 41 to generate an executable temporary object
program, and stores it in the third program storage portion
33.

At step 1403, the profiler 45 executes the temporary object
program read out from the third program storage portion 33.
In this case, since the count code is inserted in the
temporary object program, dynamic information consisting of
the call relationship between the procedures, the call count
of each procedure and so forth is collected. The obtained
dynamic information is stored in the first information
storage portion 38.

At step 1404, the optimizing portion 40 generates the
arrangement information by performing arrangement
optimization for all procedures on the basis of the dynamic
information stored in the first information storage portion
38, and stores it in the second information storage portion
39. The process of step 1404 is substantially similar to the
process of step 305 in the first embodiment. Thus, detailed
discussion of this step will be omitted.

At step 1405, the library generating portion 43 transforms
each rearrangeable library stored in the first library
storage portion 41 into the rearrangeable library arrangeable
per procedure by recognizing each procedure, and stores it in
the second library storage portion 42. The process of step
1405 is substantially similar to the process at step 301 in
the first embodiment. Thus, detailed discussion of this step
will be omitted.

At step 1406, the compiler 35 compiles the source program
into the rearrangeable object program, thereafter transforms
it into the rearrangeable object program arrangeable per
procedure by a process similar to that of the library
generating portion 43 at step 1405, and stores it in the
second program storage portion 32.

At step 1407, the linker 36 establishes a link between the
rearrangeable object program arrangeable per procedure and
the rearrangeable library arrangeable per procedure on the
basis of the arrangement information stored in the second
information storage portion 39 to generate the executable
final object program, and stores it in the fourth program
storage portion 34. Thereafter, the sequence of processes
ends.

With the construction of the shown embodiment set forth
above, even if the profiler 45 does not have a function for
collecting dynamic information, substantially the same effect
as in the embodiment described above can be obtained.

Although the present invention has been illustrated and
described with respect to exemplary embodiments thereof, it
should be understood by those skilled in the art that the
foregoing and various other changes, omissions and additions
may be made therein and thereto, without departing from the
spirit and scope of the present invention. Therefore, the
present invention should not be understood as limited to the
specific embodiments set out above but to include all
possible embodiments which can be embodied within a scope
encompassed and equivalents thereof with respect to the
features set out in the appended claims.

For example, while the present invention has been illustrated
and discussed in terms of an example applied to generating
one final object program from one source program in the
foregoing embodiments, the present invention should not be
limited to the disclosed embodiments. Namely, the present
invention is, as a matter of course, applicable to the case
where a plurality of source programs are compiled into
respective rearrangeable object programs and the
rearrangeable object programs thus generated are linked by
the linker 36 to generate one final object program.

On the other hand, in the respective embodiments set forth
above, the respective program storage portions 31 to 34, the
respective information storage portions 38 and 39 and the
library storage portions 41 and 42 are constructed with
mutually different storage media. However, the present
invention is not limited to the shown embodiments. Namely,
the respective storage portions may be formed by different
storage regions of a common storage medium.

In this case, since the respective program storage portions
31 to 34 and the library storage portions 41 and 42 hold
programs or rearrangeable libraries requiring relatively
large storage capacity, they may be constructed with FD, HD
or CD-ROM. On the other hand, since the respective
information storage portions 38 and 39 hold data requiring
relatively small storage capacity, the information storage
portions 38 and 39 may be formed with semiconductor memory,
such as RAM, ROM or so forth.

On the other hand, the respective means in the foregoing
embodiments are illustrated as hardware constructions. The
present invention should not be limited to the constructions
shown in the illustrated embodiments. Namely, it is possible
to form the program transformation system with a computer
having a CPU (central processing unit), an internal storage
device, such as ROM, RAM or so forth, an external storage
device, such as an FDD (floppy disk drive), HDD (hard disk
drive), CD-ROM drive and so forth, output means and input
means. The foregoing compilers 35 and 44, the linker 36 and
the profilers 37 and 45 may be constructed with the CPU, and
a program transformation program realizing the foregoing
functions may be stored in a storage medium, such as a
semiconductor memory including ROM or the like, FD, HD,
CD-ROM or so forth.

In this case, the foregoing internal storage device or the 
external storage device serves as respective program storage 
portions 31 to 34, respective information storage portions 38 
and 39 and the library storage portions 41 and 42. The 
program transforming program is loaded to CPU from the 
storage medium to control CPU operation. CPU is respon- 
sive to triggering of the program transforming program to 
serve as the compilers 35 and 44, the linkers 36 and the 
profilers 37 and 45. Under control of the program trans- 
forming program, the foregoing process is executed. 

As set forth above, with the construction of the present
invention, conflict between various procedures on the
cache memory can be avoided. Also, cache misses of
frequently used procedures can be prevented. As a result,
the execution speed of the program on the computer or CPU
can be increased.
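
By way of illustration only, the notion of a cache conflict between two arranged procedures can be sketched as follows. The sketch assumes a direct-mapped cache; the geometry (`LINE_SIZE`, `NUM_LINES`), the `placement` structure and the function names are hypothetical and are not taken from the embodiments.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cache geometry, chosen only for illustration. */
enum { LINE_SIZE = 32, NUM_LINES = 128 };

/* A procedure placed in primary storage: start address and size in bytes. */
struct placement {
    uint32_t addr;
    uint32_t size;
};

/* Cache line index that a byte address maps to in a direct-mapped cache. */
unsigned line_index(uint32_t addr)
{
    return (addr / LINE_SIZE) % NUM_LINES;
}

/* Number of cache lines a procedure touches, capped at NUM_LINES. */
unsigned lines_spanned(struct placement p)
{
    assert(p.size > 0);
    uint32_t first = p.addr / LINE_SIZE;
    uint32_t last = (p.addr + p.size - 1) / LINE_SIZE;
    uint32_t n = last - first + 1;
    return n > NUM_LINES ? NUM_LINES : (unsigned)n;
}

/* Returns 1 if the two procedures share at least one cache line index. */
int conflicts(struct placement a, struct placement b)
{
    unsigned na = lines_spanned(a), nb = lines_spanned(b);
    unsigned sa = line_index(a.addr), sb = line_index(b.addr);
    /* Circular distance from one procedure's first line to the other's. */
    unsigned d_ab = (sb + NUM_LINES - sa) % NUM_LINES;
    unsigned d_ba = (sa + NUM_LINES - sb) % NUM_LINES;
    return d_ab < na || d_ba < nb;
}
```

Under these assumed parameters the cache holds 4096 bytes, so two procedures whose start addresses differ by a multiple of 4096 map to the same lines and evict each other, whereas shifting one of them by two lines removes the overlap.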

Although the invention has been illustrated and described
with respect to exemplary embodiments thereof, it should be
understood by those skilled in the art that the foregoing and
various other changes, omissions and additions may be
made therein and thereto, without departing from the spirit
and scope of the present invention. Therefore, the present
invention should not be understood as limited to the specific
embodiments set out above, but as including all possible
embodiments which can be embodied within the scope
encompassed by, and equivalents of, the features set
out in the appended claims.

What is claimed is: 

1. A program transformation method for transforming a 
source program described by a programming language into 
an object program described by a language executable by a 
data processing system, comprising: 

first process of transforming at least a part of procedure, 
function or sub-routine used in said source program 
into a form so that said object program can be stored in 
an arbitrary storage region of a primary storage device 
of said data processing system; 

second process of arranging procedure, function or sub- 
routine transformed or not transformed in said first 
process in said storage region corresponding to cache 
line of a cache memory among storage region of said 
primary storage device without causing cache conflict 
on the basis of information relating to said procedure, 
function or sub-routine obtained during a process of 
transformation of said source program into said object 
program; and 

third process of generating said object program, on the 
basis of the result of arrangement. 

2. A program transformation method as set forth in claim 
1, wherein said procedure, function or sub-routine is at least 
one of that defined by a user in said source program, that 
defined and inspected by the user, that preliminarily pre- 
pared in a processing system in said programming language 
and that preliminarily prepared in a form of instruction code. 

3. A program transformation method as set forth in claim 
1, wherein said information is obtained by execution of a 
temporary object program transformed from said source 
program and is consisted of information indicative of num- 
ber of times that said procedure, function or sub-routine is 
actually called and information indicative of call relation- 
ship between procedures, functions or sub-routines. 
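
By way of illustration only, the collection of call counts and call relationships by a temporary object program might look like the following sketch, based on the FIG. 2 example as far as it can be read. The identifiers `proc_id`, `call_count`, `call_edge` and `record_call` are hypothetical; the claims do not specify how the count code is implemented.

```c
#include <assert.h>

/* Hypothetical procedure identifiers for the FIG. 2 example. */
enum proc_id { PROC_FUNC, PROC_FUNC1, PROC_FUNC2, PROC_COUNT };

/* Dynamic information: per-procedure call counts and caller-to-callee counts. */
unsigned call_count[PROC_COUNT];
unsigned call_edge[PROC_COUNT][PROC_COUNT];

int count = 0;  /* global variable updated by the example procedures */

/* The "count code": executed at each call in the temporary object program. */
void record_call(enum proc_id caller, enum proc_id callee)
{
    assert(caller < PROC_COUNT && callee < PROC_COUNT);
    call_count[callee]++;          /* number of times the callee is called */
    call_edge[caller][callee]++;   /* call relationship caller -> callee   */
}

void func1(void) { count += 1; }
void func2(void) { count += 2; }

/* func() with the count code inserted before each call. */
void func(void)
{
    record_call(PROC_FUNC, PROC_FUNC1); func1();
    record_call(PROC_FUNC, PROC_FUNC2); func2();
    record_call(PROC_FUNC, PROC_FUNC1); func1();
}
```

After one execution of `func()`, the tables record that `func1` was called twice and `func2` once, both from `func`. This corresponds to the kind of information a later arrangement step would consume.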
4. A program transformation method as set forth in claim
1, wherein said information is obtained by execution of a 
temporary object program transformed from said source 
program and is consisted of information indicative of num- 
ber of times that said procedure, function or sub-routine is 
actually called and information indicative of call relation- 
ship between procedures, functions or sub-routines, 

in said second process, 

said procedures, functions or sub-routines are divided into 
a plurality of groups on the basis of call frequency, and 

said procedures, functions or sub-routines are arranged in 
said storage region corresponding to cache lines of a 
cache memory among the storage region of said pri- 
mary storage device. 
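
By way of illustration only, dividing procedures into groups by call frequency and arranging them on cache line boundaries could be sketched as below. This is a simplified stand-in and not the arrangement optimization of the embodiments: it assumes two groups split by a hypothetical threshold, a hypothetical line size `LINE_SIZE`, and a contiguous layout in descending order of call count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

enum { LINE_SIZE = 32 };  /* hypothetical cache line size in bytes */

struct proc {
    const char *name;
    uint32_t size;    /* code size in bytes */
    unsigned calls;   /* call count from the dynamic information */
    int group;        /* 0 = frequently called, 1 = rarely called */
    uint32_t addr;    /* assigned start address */
};

/* qsort comparator: higher call counts first. */
int by_calls_desc(const void *a, const void *b)
{
    const struct proc *pa = a, *pb = b;
    return (pa->calls < pb->calls) - (pa->calls > pb->calls);
}

/* Round n up to the next multiple of LINE_SIZE. */
uint32_t align_up(uint32_t n)
{
    return (n + LINE_SIZE - 1) / LINE_SIZE * LINE_SIZE;
}

/*
 * Divide procedures into two groups by a call-count threshold and lay them
 * out contiguously from `base`, most frequently called first, each starting
 * on a cache line boundary. Returns the first free address after the layout.
 */
uint32_t arrange(struct proc *p, size_t n, unsigned hot_threshold, uint32_t base)
{
    qsort(p, n, sizeof *p, by_calls_desc);
    uint32_t addr = align_up(base);
    for (size_t i = 0; i < n; i++) {
        p[i].group = p[i].calls >= hot_threshold ? 0 : 1;
        p[i].addr = addr;
        addr = align_up(addr + p[i].size);
    }
    return addr;
}
```

Because the frequently called group is laid out contiguously, its members occupy distinct cache lines in a direct-mapped cache as long as their combined size does not exceed the cache capacity. A real implementation would also need to use the call relationship information, which this sketch ignores.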

5. A program transformation method for transforming a 
source program described by a programming language into 
an object program described by a language executable by a 
data processing system, comprising: 

first process of transforming at least a part of procedure, 
function or sub-routine used in said source program 
into a form for storing in an arbitrary storage region of 
a primary storage device of said data processing system 
when said object program is used in said data process- 
ing system; 

second process of transforming said source program into 
said object program, and in conjunction therewith, and 
concerning said object program, transforming 
procedure, function or sub-routine defined by a user in 
said source program into a form storable in arbitrary 
region of said primary storage device; 

third process of linking the procedure, function or sub- 
routine transformed into said first process and the 
object program obtained in said second process; 

fourth process of collecting dynamic information con- 
sisted of information indicative of number of times that 
the procedure, function or sub-routine is actually 
called, and information indicative of call relationship 
between said procedure, function or sub-routine with 
executing the object program obtained through said 
third process; 

fifth process of arranging said procedure, function or 
sub-routine in said storage region corresponding to the 
cache line of the cache memory among the storage 
region of said primary storage device with avoiding 
cache conflict, on the basis of said dynamic informa- 
tion; and 

sixth process of generating a final object program by 
linking said procedure, function or sub-routine trans- 
formed in said first process and the object program 
obtained in said second process, on the basis of said 
arrangement information. 

6. A program transformation method as set forth in claim 
5, wherein said procedure, function or sub-routine is at least 
one of that defined by a user in said source program, that 
defined and inspected by the user, that preliminarily pre- 
pared in a processing system in said programming language 
and that preliminarily prepared in a form of instruction code. 

7. A program transformation method as set forth in claim 
5, wherein in said second process, 

said procedures, functions or sub-routines are divided into 
a plurality of groups on the basis of call frequency, and

said procedures, functions or sub-routines are arranged in 
said storage region corresponding to cache lines of a 
cache memory among the storage region of said pri- 
mary storage device. 
8. A program transformation method for transforming a
source program described by a programming language into 
an object program described by a language executable by a 
data processing system, comprising: 

first process of transforming said source program into a 
temporary object program, and in conjunction 
therewith, upon executing said temporary object 
program, inserting a code for counting number of times 
that said procedure, function or sub-routine is actually 
called; 

second process of linking one of the procedure, function 
or sub-routine that defined by a user in said source 
program, that defined and inspected by the user, that 
preliminarily prepared in a processing system in said 
programming language and that preliminarily prepared 
in a form of instruction code, with said temporary 
object program obtained through said first process; 

third process of collecting dynamic information consisted 
of information indicative of number of times that the 
procedure, function or sub-routine is actually called, 
and information indicative of call relationship between 
said procedure, function or sub-routine with executing 
the object program obtained through said second pro- 
cess; 

fourth process of arranging said procedure, function or 
sub-routine in said storage region corresponding to the 
cache line of the cache memory among the storage 
region of said primary storage device with avoiding 
cache conflict, on the basis of said dynamic informa- 
tion; and 

fifth process of transforming at least part of one defined by 
a user in said source program, one defined and 
inspected by the user, one preliminarily prepared in a 
processing system in said programming language and 
one preliminarily prepared in a form of instruction code 
among the procedure, function or sub-routine to be 
used in said source program into a form storable in an 
arbitrary storage region of said primary storage region, 
in which said object program is stored as actually used 
in said data processing system; 

sixth process, after transforming said source program into 
said object program, concerning said object program, 
transforming procedure, function or sub-routine 
defined by a user in said source program into a form 
storable in arbitrary region of said primary storage 
device; 

seventh process of generating a final object program by 
linking said procedure, function or sub-routine trans- 
formed in said fifth process and the object program 
obtained in said sixth process, on the basis of said 
arrangement information. 

9. A program transformation method as set forth in claim 
8, wherein in said fourth process, 

said procedures, functions or sub-routines are divided into 
a plurality of groups on the basis of call frequency, and

said procedures, functions or sub-routines are arranged in 
said storage region corresponding to cache lines of a 
cache memory among the storage region of said pri- 
mary storage device. 

10. A program transformation system for transforming a 
source program described by a programming language into 
an object program described by a language executable by a 
data processing system, comprising: 
procedure transforming means for transforming at least a
part of procedure, function or sub-routine used in said
source program into a form so that said object program
can be stored in an arbitrary storage region of a primary
storage device of said data processing system;

optimizing means for arranging procedure, function or
sub-routine transformed or not transformed in said
procedure transforming means in said storage region
corresponding to cache line of a cache memory among
storage region of said primary storage device without
causing cache conflict on the basis of information
relating to said procedure, function or sub-routine
obtained during a process of transformation of said
source program into said object program; and

generating means for generating said object program, on 
the basis of the result of arrangement. 

11. A program transformation system as set forth in claim 
10, wherein said procedure, function or sub-routine is at 
least one of that defined by a user in said source program, 
that defined and inspected by the user, that preliminarily 
prepared in a processing system in said programming lan- 
guage and that preliminarily prepared in a form of instruc- 
tion code. 

12. A program transformation system as set forth in claim 
10, wherein said information is obtained by execution of a 
temporary object program transformed from said source 
program and is consisted of information indicative of num-
ber of times that said procedure, function or sub-routine is
actually called and information indicative of call relation-
ship between procedures, functions or sub-routines.

13. A program transformation system as set forth in claim
10, wherein said information is obtained by execution of a
temporary object program transformed from said source
program and is consisted of information indicative of num-
ber of times that said procedure, function or sub-routine is
actually called and information indicative of call relation-
ship between procedures, functions or sub-routines,

in said optimizing means,

said procedures, functions or sub-routines are divided into 
a plurality of groups on the basis of call frequency, and 
said procedures, functions or sub-routines are arranged in 
said storage region corresponding to cache lines of a 
cache memory among the storage region of said pri- 
mary storage device. 

14. A program transformation system for transforming a 
source program described by a programming language into 
an object program described by a language executable by a 
data processing system, comprising: 

procedure transforming means for transforming at least a
part of procedure, function or sub-routine used in said
source program into a form for storing in an arbitrary
storage region of a primary storage device of said data
processing system when said object program is used in
said data processing system;

program transforming means for transforming said source
program into said object program, and in conjunction
therewith, and concerning said object program, trans-
forming procedure, function or sub-routine defined by
a user in said source program into a form storable in
arbitrary region of said primary storage device;

linking means for linking the procedure, function or
sub-routine transformed into said procedure transform-
ing means and the object program obtained in said
program transforming means;
dynamic information collecting means for collecting
dynamic information consisted of information indicative
of number of times that the procedure, function or
sub-routine is actually called, and information indicative
of call relationship between said procedure, function
or sub-routine with executing the object program
obtained through said linking means; and

optimizing means for arranging said procedure, function
or sub-routine in said storage region corresponding to
the cache line of the cache memory among the storage
region of said primary storage device with avoiding
cache conflict, on the basis of said dynamic information,

said linking means generating a final object program by
linking said procedure, function or sub-routine trans-
formed in said procedure transforming means and the
object program obtained in said program transforming
means, on the basis of said arrangement information.

15. A program transformation system as set forth in claim
14, wherein in said optimizing means,

said procedures, functions or sub-routines are divided into
a plurality of groups on the basis of call frequency, and

said procedures, functions or sub-routines are arranged in
said storage region corresponding to cache lines of a
cache memory among the storage region of said pri-
mary storage device.

16. A program transformation system for transforming a
source program described by a programming language into
an object program described by a language executable by a
data processing system, comprising:

program transforming means for transforming said source
program into a temporary object program, and in con-
junction therewith, upon executing said temporary
object program, inserting a code for counting number
of times that said procedure, function or sub-routine is
actually called;

linking means for linking one of the procedure, function
or sub-routine that defined by a user in said source
program, that defined and inspected by the user, that
preliminarily prepared in a processing system in said
programming language and that preliminarily prepared
in a form of instruction code, with said temporary
object program obtained through said program trans-
forming means;

dynamic information collecting means for collecting
dynamic information consisted of information indicative
of number of times that the procedure, function or
sub-routine is actually called, and information indicative
of call relationship between said procedure, function
or sub-routine with executing the object program
obtained through said linking means;

optimizing means for arranging said procedure, function
or sub-routine in said storage region corresponding to
the cache line of the cache memory among the storage
region of said primary storage device with avoiding
cache conflict, on the basis of said dynamic informa-
tion; and

procedure transforming means for transforming at least
part of one defined by a user in said source program,
one defined and inspected by the user, one preliminarily
prepared in a processing system in said programming
language and one preliminarily prepared in a form of
instruction code among the procedure, function or
sub-routine to be used in said source program into a
form storable in an arbitrary storage region of said
primary storage region, in which said object program is
stored as actually used in said data processing system;

said program transforming means transforming said
source program into said object program, concerning
said object program, transforming procedure, function
or sub-routine defined by a user in said source program
into a form storable in arbitrary region of said primary
storage device, and said linking means generating a
final object program by linking said procedure, func-
tion or sub-routine transformed in said procedure trans-
forming means and the object program obtained in said
program transforming means, on the basis of said
arrangement information.

17. A program transformation system as set forth in claim
16, wherein in said optimizing means,

said procedures, functions or sub-routines are divided into
a plurality of groups on the basis of call frequency, and

said procedures, functions or sub-routines are arranged in
said storage region corresponding to cache lines of a
cache memory among the storage region of said pri-
mary storage device.

18. A computer readable memory storing a language
processing program for transforming a source program
described by a programming language into an object pro-
gram described by a language executable by a data process-
ing system, said language processing program comprising:

first process of transforming at least a part of procedure,
function or sub-routine used in said source program
into a form so that said object program can be stored in
an arbitrary storage region of a primary storage device
of said data processing system;

second process of arranging procedure, function or sub-
routine transformed or not transformed in said first
process in said storage region corresponding to cache
line of a cache memory among storage region of said
primary storage device without causing cache conflict
on the basis of information relating to said procedure,
function or sub-routine obtained during a process of
transformation of said source program into said object
program; and

third process of generating said object program, on the
basis of the result of arrangement.

19. A computer readable memory as set forth in claim 18,
wherein said information is obtained by execution of a
temporary object program transformed from said source
program and is consisted of information indicative of num-
ber of times that said procedure, function or sub-routine is
actually called and information indicative of call relation-
ship between procedures, functions or sub-routines,

in said second process,

said procedures, functions or sub-routines are divided into
a plurality of groups on the basis of call frequency, and

said procedures, functions or sub-routines are arranged in
said storage region corresponding to cache lines of a
cache memory among the storage region of said pri-
mary storage device.

20. A computer readable memory storing a language
processing program for transforming a source program
described by a programming language into an object pro-
gram described by a language executable by a data process-
ing system, said language processing program comprising:

first process of transforming at least a part of procedure,
function or sub-routine used in said source program
into a form for storing in an arbitrary storage region of
a primary storage device of said data processing system
when said object program is used in said data process-
ing system;

second process of transforming said source program into
said object program, and in conjunction therewith, and
concerning said object program, transforming
procedure, function or sub-routine defined by a user in
said source program into a form storable in arbitrary
region of said primary storage device;

third process of linking the procedure, function or sub-
routine transformed into said first process and the
object program obtained in said second process;

fourth process of collecting dynamic information con- 
sisted of information indicative of number of times that 
the procedure, function or sub-routine is actually 
called, and information indicative of call relationship
between said procedure, function or sub-routine with 
executing the object program obtained through said 
third process; 

fifth process of arranging said procedure, function or
sub-routine in said storage region corresponding to the 
cache line of the cache memory among the storage 
region of said primary storage device with avoiding 
cache conflict, on the basis of said dynamic informa- 
tion; and 

sixth process of generating a final object program by 
linking said procedure, function or sub-routine trans- 
formed in said first process and the object program 
obtained in said second process, on the basis of said 
arrangement information. 

21. A computer readable memory as set forth in claim 20, 
wherein in said second process, 

said procedures, functions or sub-routines are divided into
a plurality of groups on the basis of call frequency, and

said procedures, functions or sub-routines are arranged in
said storage region corresponding to cache lines of a
cache memory among the storage region of said pri-
mary storage device.

22. A computer readable memory storing a language 
processing program for transforming a source program 
described by a programming language into an object pro- 
gram described by a language executable by a data process- 
ing system, said language processing program comprising: 

first process of transforming said source program into a 
temporary object program, and in conjunction 
therewith, upon executing said temporary object 
program, inserting a code for counting number of times
that said procedure, function or sub-routine is actually 
called; 
second process of linking one of the procedure, function
or sub-routine that defined by a user in said source 
program, that defined and inspected by the user, that 
preliminarily prepared in a processing system in said 
programming language and that preliminarily prepared 
in a form of instruction code, with said temporary 
object program obtained through said first process; 

third process of collecting dynamic information consisted 
of information indicative of number of times that the 
procedure, function or sub-routine is actually called, 
and information indicative of call relationship between 
said procedure, function or sub-routine with executing 
the object program obtained through said second pro- 
cess; 

fourth process of arranging said procedure, function or 
sub-routine in said storage region corresponding to the 
cache line of the cache memory among the storage 
region of said primary storage device with avoiding 
cache conflict, on the basis of said dynamic informa- 
tion; and 

fifth process of transforming at least part of one defined by 
a user in said source program, one defined and 
inspected by the user, one preliminarily prepared in a 
processing system in said programming language and 
one preliminarily prepared in a form of instruction code 
among the procedure, function or sub-routine to be 
used in said source program into a form storable in an 
arbitrary storage region of said primary storage region, 
in which said object program is stored as actually used 
in said data processing system; 

sixth process, after transforming said source program into 
said object program, concerning said object program, 
transforming procedure, function or sub-routine 
defined by a user in said source program into a form 
storable in arbitrary region of said primary storage 
device; 

seventh process of generating a final object program by 
linking said procedure, function or sub-routine trans-
formed in said fifth process and the object program
obtained in said sixth process, on the basis of said 
arrangement information. 

23. A computer readable memory as set forth in claim 22, 
wherein in said fourth process, 

said procedures, functions or sub-routines are divided into 
a plurality of groups on the basis of call frequency, and 

said procedures, functions or sub-routines are arranged in 
said storage region corresponding to cache lines of a 
cache memory among the storage region of said pri- 
mary storage device. 